Preface


The market for undergraduate textbooks on quantum mechanics is oversaturated as 
one can literally choose from tens, if not hundreds, of different titles, with several 
of them being accepted as a standard. Writing yet another textbook on this subject 
might appear to be a reckless and time-consuming adventure, and whoever engages
in it can rightfully be suspected of arrogance and self-aggrandizement. Be that as
it may, after 15 years of teaching an undergraduate quantum mechanics course at
Queens College of the City University of New York, I finally came to the realization
that neither I nor my students are particularly happy with the existing standards. 
It also occurred to me that my approaches to teaching various topics in quantum 
mechanics, along with the lecture notes I have accumulated over these years, could 
form the foundation of a textbook that would be different from those I saw on the 
market. In many cases, students treat their physics textbooks as a reference source 
for formulas and postulates—used to solve problems rather than as actual reading 
material. I, on the other hand, dreamed about writing a book that would actually be 
read; and in order to achieve that, I have devoted a significant amount of time to
explaining the most minute details of various technical derivations.
The level of technical detail in the book would ideally allow students to use it 
without the need for a lecturer’s explanations, allowing professors to use precious 
lecture time on something more productive and fun. But quantum mechanics is not 
just about derivations of formulas (though these might be fun, of course, for the 
mathematically inclined). The physical content of the derived results is immensely 
more important and interesting, if you ask me. Thus, I have included in the text 
extensive qualitative discussions of the physical significance of derived formulas 
and equations. Finally, to make reading even more enjoyable, I tried to preserve the 
informal, colloquial style of my lectures, addressing readers directly and avoiding 
the dry, impersonal manner found in too many formal scientific texts. 

Most physics textbooks present ideas and concepts without more than a passing 
mention of the people who discovered them. Such an approach aims to emphasize 
the objectivity of the laws and principles that physics deals with. The essence of 
these laws does not depend on the personal traits and ideologies of their discoverers, 
which is not always the case in the humanities. While I agree that such a cold, formal
approach is justified by the objective nature of the laws of physics, I am still not sure
that it is the best way to present the material to students. This approach dehumanizes 
physics, preventing people from relating to it on a personal level and seeing physics 
as part of the general human experience. In this book, I tried to break out of this 
tradition and introduced some personal details about the lives of those scientists 
who were responsible for developing quantum theory and changing our views of the 
universe along the way. I would like the readers to see that the complex technical and 
philosophical ideas in quantum mechanics were generated by mortal human beings 
with strengths and weaknesses—that they experienced struggles and made mistakes 
for which they had to bear full responsibility. Obviously, it would be impossible to 
talk about all these great (and sometimes not so great) men 1 in detail. But whenever 
possible, I have tried to provide bits and pieces about the personal lives of scientists 
whose names appear in the text. 

Anyone writing a textbook on such an immense subject as quantum mechanics 
always struggles with the question of which topics to include and which to leave out. 
There are, of course, some standard concepts that cannot be avoided, but beyond 
those, the choice is always a function of the author’s personal predilections. These 
predilections led me to include some topics that are not usually covered in undergraduate
textbooks, such as the Heisenberg equations, the transfer-matrix method
for one-dimensional problems, optical transitions in semiconductors, Landau levels, 
and the Hall conductivity. At the same time, I left out such popular topics as WKB 
and variational methods. The total contents of the book were chosen to satisfy the 
needs of a two-semester course for students who have already been exposed to 
some quantum ideas in a modern physics course or its equivalent. At the same 
time, the book can also be used for a one-semester course. While each instructor 
deciding to adopt this book can choose whatever material they prefer for their one- 
semester course, my suggestion would be to include Chaps. 1 through 9, as well as 
Chaps. 11 and 15, which can be taught almost independently of other chapters of the 
book. (While there are some cross-references between these and other chapters, they 
shouldn’t impede the student’s or instructor’s ability to get through the arguments.) 
Finally, I would like to emphasize that the problems offered for solution by the 
students are an integral part of the text and must be treated as such. 

Quantum mechanics is one of the ultimate triumphs of the human mind. I enjoyed 
writing this book, trying to convey the awesomeness of quantum ideas and of the 
people who contributed to their development, and I hope that you will enjoy reading 


1 Needless to say, at the time of the birth of quantum physics, most of the scientists were, indeed, men.
Even more remarkable are the achievements of the few women who left their mark on twentieth-century
physics. Nobel Prize winners Marie and Irene Curie have since become household
names, but other female scientists, such as German mathematician Emmy Noether and Austrian-Swedish
experimental physicist Lise Meitner, also deserve to be remembered. Unfortunately,
the contributions of Noether, who uncovered the intimate relationship between symmetries and
conservation laws, and Meitner, who together with Otto Hahn discovered the fission of uranium,
are too advanced for an introductory quantum mechanics book, preventing me from talking about
these scientists in more detail.
it. While physics students, unlike their brethren from humanities departments, are not
accustomed to actually reading their assigned texts, I would like to encourage you 
to overcome the established habits and give this book a chance. Who knows, you 
might like it. 

New York, NY, USA Lev I. Deych 
1.1 The Rise of Quantum Physics and Its Many Oddities

In 1900, Scots-Irish physicist Lord Kelvin (born William Thomson, knighted in
1866 by Queen Victoria for his work on the transatlantic telegraph, and ennobled in 1892 for
his achievements in thermodynamics, becoming the first British scientist elevated
to the House of Lords of the British Parliament) gave his famous speech identifying
only two “clouds” in the clear sky of classical physics. One of them was the problem
of the luminiferous ether, which went undetected in the series of experiments carried out between 1881
and 1887 by Americans Albert Michelson and Edward Morley, and the second
one was the problem of black-body radiation. Classical physics predicted that
the amount of electromagnetic energy emitted by warm bodies increases without bound as the
wavelength of the radiation decreases, making the total emitted energy infinite. This
problem was dubbed the ultraviolet catastrophe by Paul Ehrenfest in 1911. As it
turned out, both of these little clouds spelled the end of classical physics: the first
of them resulted in the special theory of relativity, and the solution of the second one,
achieved by German physicist Max Planck, laid the first stone in the foundation of
quantum physics.

To explain the radiation of black bodies, Planck had to introduce unusual
entities: oscillators whose energy cannot be changed continuously, but must be a
multiple of an elementary energy quantum $h\nu$, where $h = 6.55 \times 10^{-34}$ J s is a
fundamental constant of nature introduced by Planck and $\nu$ is the classical frequency
of the oscillator. (This expression is often written as $\hbar\omega$, where $\hbar$ (h-bar) is “the
reduced” Planck’s constant and $\omega$ is the angular frequency of the oscillator. Given
that $\omega = 2\pi\nu$, you can easily find that $\hbar = h/2\pi$.) Using this assumption and the
apparatus of classical statistical physics, Planck derived his famous formula for
the spectral energy density of black-body radiation:

$$u(\nu, T) = \frac{8\pi h \nu^3}{c^3} \, \frac{1}{e^{h\nu/k_B T} - 1} \tag{1.1}$$

where $u(\nu, T)\,d\nu$ is the energy density of electromagnetic waves with frequencies
within the interval $[\nu, \nu + d\nu]$, $c$ is the speed of light, $k_B$ is the Boltzmann constant, and $T$ is the absolute
temperature. 1 The original value of the constant $h$, which Planck derived by fitting
his formula to the experimental data, turned out to be quite close to the modern value,
which is currently believed to be

$$h = 6.62607004 \times 10^{-34}\ \text{J s}.$$
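
If you would like to see the ultraviolet catastrophe with your own eyes, here is a minimal Python sketch that compares Planck's spectral energy density with its classical counterpart. The classical expression used here, $8\pi\nu^2 k_B T/c^3$, is the standard Rayleigh-Jeans result, which the text above does not name or derive; the temperature and the list of frequencies are arbitrary illustrative choices, and the function names are mine.

```python
import math

H = 6.62607004e-34      # Planck constant in J s (value quoted in the text)
K_B = 1.380649e-23      # Boltzmann constant in J/K
C = 2.99792458e8        # speed of light in m/s


def planck_density(nu, temperature):
    """Planck spectral energy density u(nu, T) in J s / m^3."""
    x = H * nu / (K_B * temperature)
    # math.expm1(x) evaluates exp(x) - 1 accurately even for small x
    return 8.0 * math.pi * H * nu**3 / C**3 / math.expm1(x)


def rayleigh_jeans_density(nu, temperature):
    """Classical (Rayleigh-Jeans) density; it grows as nu^2 without bound."""
    return 8.0 * math.pi * nu**2 * K_B * temperature / C**3


if __name__ == "__main__":
    T = 5000.0  # illustrative temperature in kelvin (an assumption)
    for nu in (1e12, 1e13, 1e14, 1e15, 1e16):
        print(f"{nu:8.1e} Hz  Planck={planck_density(nu, T):.3e}  "
              f"classical={rayleigh_jeans_density(nu, T):.3e}")
```

At low frequencies, where $h\nu \ll k_B T$, the two expressions agree. At high frequencies the classical density keeps growing, while the exponential in Planck's formula cuts the spectrum off and keeps the total energy finite.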


The idea that the energy of a classical particle cannot take an arbitrary continuously
changing value seemed to Planck quite revolutionary, so much so that he refused
to believe that his quantized oscillators were related to any real physical objects,
such as atoms or molecules, thinking of them as a purely mathematical trick
that happened to work. It took Einstein’s theory of the photoelectric effect (1905), where
Einstein explicitly postulated that electromagnetic energy propagates in the form
of quantized and indivisible portions, his 1907 theory of specific heat in solids, and
other developments, for Planck to finally declare in 1911 that the hypothesis of
quanta reflects physical reality and marks the beginning of a new era in physics.

From this point on, quantum ideas started rolling down on unsuspecting physicists
like an avalanche, burying the entire classical world of objective
reality and certainty underneath. Or so it seemed. Einstein opened the can of worms by
proposing the notion of light quanta, which introduced into the consciousness of
physicists the idea of wave-particle duality. In 1923 this idea was picked
up by a young French Ph.D. student, Louis de Broglie, who in his dissertation
suggested that not only can light, normally thought to be a wave, behave as a
stream of particles, but regular particles (electrons, neutrons, atoms, etc.)
can also manifest wavelike properties. He postulated that Einstein’s relations connecting
the wave characteristics of light (frequency $\nu$ and wavelength $\lambda$) with its particle
characteristics (energy $E$ and momentum $p$)

$$E = h\nu \tag{1.2}$$

$$p = h/\lambda \tag{1.3}$$


1 The actual story of this formula and Planck’s contribution to quantum physics is not quite that
simple. First, the ultraviolet catastrophe apparently did not motivate Planck at all as he did not 
think that it was an unavoidable logical consequence of classical physics. Second, Planck first 
obtained his formula empirically by trying to fit the experimental data and only after that found 
a theoretical “explanation” for it. Third, he did not believe that his quantized oscillators represent 
real atoms or molecules for quite some time and accepted the reality of energy quantization only 
very reluctantly. This story is described in more detail in the article by H. Kragh “Max Planck: the 
reluctant revolutionary” published in December 2000 in Physics World. 
can be reversed and applied to electrons, protons, and other material particles. A
significant difference between light and electrons, of course, is that the latter have
a finite mass and obey a quadratic relation between energy and momentum, $E = p^2/2m_e$,
where $m_e$ is the mass of the particle, while the former is characterized by
the linear relation $E = pc$, where $c$ is the speed of light, following from relativity
theory for particles with zero rest mass. This difference, as you will see later, plays a
crucial role in quantum theory.
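
To get a feeling for the numbers, here is a short Python sketch that turns an energy into a wavelength using $p = h/\lambda$ together with the two energy-momentum relations just mentioned. It is only an illustration: the function names are mine, the electron is treated nonrelativistically, and the sample energies (including 54 eV, a value often quoted for low-energy electron diffraction) are assumptions rather than numbers taken from the text.

```python
import math

H = 6.62607004e-34         # Planck constant, J s (value quoted in the text)
M_E = 9.1093837e-31        # electron mass, kg
C = 2.99792458e8           # speed of light, m/s
EV = 1.602176634e-19       # one electron volt in joules


def electron_wavelength(kinetic_energy_ev):
    """de Broglie wavelength (m) of a nonrelativistic electron, E = p^2 / 2m."""
    energy = kinetic_energy_ev * EV
    momentum = math.sqrt(2.0 * M_E * energy)   # invert E = p^2 / (2 m_e)
    return H / momentum                        # lambda = h / p


def photon_wavelength(energy_ev):
    """Wavelength (m) of a photon, for which E = p c."""
    momentum = energy_ev * EV / C              # invert E = p c
    return H / momentum                        # lambda = h / p


if __name__ == "__main__":
    for energy_ev in (1.0, 54.0, 1000.0):
        print(f"E = {energy_ev:7.1f} eV: "
              f"electron {electron_wavelength(energy_ev) * 1e9:.4f} nm, "
              f"photon {photon_wavelength(energy_ev) * 1e9:.2f} nm")
```

A 54 eV electron comes out with a wavelength of about 0.167 nm, comparable to the spacing between atoms in a crystal, which is why crystals can act as diffraction gratings for electrons. Note also the different scaling: quadrupling the energy halves the electron's wavelength but cuts the photon's wavelength by a factor of four.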

The revolutionary matter wave idea of de Broglie was confirmed experimentally
just 4 years later, in 1927, by American physicists Clinton Davisson and Lester
Germer working at Bell Labs and independently by British scientist George 
Paget Thomson at the University of Aberdeen. Davisson and Germer observed 
diffraction of electrons propagating through crystalline nickel, while Thomson 
studied electrons propagating through a metal film. 2 These achievements resulted 
in the 1929 Nobel Prize for de Broglie and the shared 1937 Nobel Prize for
Davisson and Thomson. So, if you thought that the quantization of energy was a 
revolutionary idea, then the wave-particle dualism must really blow your mind: how 
can something be simultaneously a particle (localized, indivisible, countable entity) 
and a wave (extended, continuous, arbitrarily divisible excitation of a medium)? To 
wrap his mind around this oddity of the quantum world, Danish physicist Niels Bohr 
came up with his famed complementarity principle, which essentially states that all 
experiments that one can conduct with objects of atomic scale can be divided into 
two never-overlapping groups. In the experiments belonging to the first group, the 
material objects reveal their particle-like side, and in the experiments of the second 
group, they present to the world their wavelike quality, and it is impossible to design
an experiment in which both sides are manifested together. In the book The Evolution of
Physics by L. Infeld and A. Einstein, this principle was presented in the following
way:

But what is light really? Is it a wave or a shower of photons? There seems no likelihood for 
forming a consistent description of the phenomena of light by a choice of only one of the 
two languages. It seems as though we must use sometimes the one theory and sometimes 
the other, while at times we may use either. We are faced with a new kind of difficulty.
We have two contradictory pictures of reality; separately neither of them fully explains the
phenomena of light, but together they do. 

Even though this quote refers to the properties of light, it can be repeated almost
verbatim for the quantum description of any atomic object. Bohr’s own formulation
of complementarity, as presented in his famous 1927 lecture at a conference in
the Italian town of Como is somewhat deeper even if more vague when taken out of 
the broader context: 

any given application of classical concepts precludes the simultaneous use of other classical 
concepts which in a different connection are equally necessary for the elucidation of the 
phenomena. 


2 Here is a historical irony for you: George Thomson, who got the Nobel Prize for proving that 
electrons are waves, is a son of J.J. Thomson (not to be confused with W. Thomson—Lord Kelvin), 
a prominent English physicist, who got the Nobel Prize for proving that an electron is a particle. 
What Bohr is implying here is that by virtue of the macroscopic size of an
observer (that would be us, humans), any measurement is necessarily conducted 
by a large macroscopic apparatus, which is supposed to be fully describable by the 
laws of classical physics. The results of the measurements, therefore, are interpreted 
in terms of classical concepts such as momentum, position, energy, time, etc. Then
the complementarity principle posits that one cannot design a single experiment in
which classical quantities belonging to complementary classes, such as momentum
and position or energy and time, can be determined together. Thus measuring the position of
a quantum object, i.e., trying to localize it at a point in space and time, reveals
this object’s particle-like characteristics while simultaneously destroying any traces of
its wavelike behavior. The most frequently discussed illustration of this idea is
a double-slit interference experiment beautifully analyzed, for instance, in the 
famous The Feynman Lectures on Physics, which are now freely available online
at http://www.feynmanlectures.caltech.edu/, and I encourage you to go ahead and 
peruse them. The scale disparity between observers and atomic objects is what 
makes quantum theory so difficult to “understand”—our vocabulary developed to 
reflect the macroscopic world of our own scale fails when we try to apply it to the 
world of atoms and electrons. This is how Richard Feynman describes it: 

Because atomic behavior is so unlike ordinary experience, it is very difficult to get used 
to, and it appears peculiar and mysterious to everyone—both to the novice and to the 
experienced physicist. Even the experts do not understand it the way they would like to, 
and it is perfectly reasonable that they should not, because all of direct, human experience 
and of human intuition applies to large objects. We know how large objects will act, but 
things on a small scale just do not act that way. So we have to learn about them in a sort of 
abstract or imaginative fashion and not by connection with our direct experience. 3 

In 1927 German physicist Werner Heisenberg found himself locked in a battle of ideas with Austrian
physicist Erwin Schrodinger, the author of the famed Schrodinger equation, who
took de Broglie’s matter-wave idea to heart and introduced a wave
function satisfying a simple differential equation, which he tried to interpret quite
literally as a quantity representing an actual electron smeared over some small but
finite region of space. This picture differed strongly from Heisenberg’s approach, which was based upon
the idea of “quantum jumps” introduced in Bohr’s model of the hydrogen atom (I
am sure you have heard about Bohr’s quantization postulates and his planetary model of
the atom, so I do not have to reproduce them here). Heisenberg described these jumps using abstract
algebraic quantities, which his collaborators and compatriots Ernst
Pascual Jordan and Max Born later recognized as matrices. Schrodinger’s interpretation of the
matter waves was very appealing, especially to older physicists, because it preserved
most of the classical world view: continuous evolution of the electron’s charge density
in space-time with no incomprehensible discontinuities introduced by quantum
jumps. You might be amused by the following exchange between Schrodinger and 
Bohr that took place during Schrodinger’s visit to Copenhagen in October of 1926 4 : 


3 R. Feynman, R. Leighton, M. Sands, Feynman Lectures on Physics, vol. 1, Ch. 37, online edition 
available at http://www.feynmanlectures.caltech.edu/. 

4 Quoted from the book by W. Moore, Schrodinger. Life and Thought (Cambridge University Press, 
Cambridge, 1989). 
Schrodinger: “You surely must understand, Bohr, that the whole idea of quantum jumps
necessarily leads to nonsense. ... the electron jumps from this orbit to another one and 
thereby radiates. Does this transition occur gradually or suddenly? If it occurs gradually, 
then the electron must gradually change its rotation frequency and energy. It is not 
comprehensible how this can give sharp frequencies for spectral lines. If the transition 
occurs suddenly, in a jump so to speak,... one must ask how the electron moves in a jump. 
Why doesn’t it emit a continuous spectrum? And what laws determine its motion in this 
jump?” 

Bohr: “Yes, in what you say you are completely right. But that doesn’t prove that there 
are no quantum jumps. It only proves that we can’t visualize them, that means that the 
pictorial concepts we use to describe the events of everyday life and the experiments of 
the old physics do not suffice also to represent the process of a quantum jump. That is not 
surprising when one considers that the processes with which we are concerned here cannot 
be the subject of direct experience ... and our concepts do not apply to them”. 

Schrodinger: “I do not want to get into a philosophical discussion with you about the 
formation of concepts ... but I should simply like to know what happens in an atom. It’s all 
the same to me in what language you talk about it. If there are electrons in atoms, which 
are particles, as we have so far supposed, they must also move about in some way. At the 
moment, it’s not important to me to describe this motion exactly; but it must at least be 
possible to bring out how they behave in a stationary state or in a transition from one state 
to another. But one sees from the mathematical formalism of wave or quantum mechanics 
that it gives no rational answer to these questions. As soon, however, as we are ready to
change the picture, so as to say that there are no electrons as particles but rather electron 
waves or matter waves, everything looks different. We no longer wonder about the sharp 
frequencies. The radiation of light becomes as easy to understand as the emission of radio 
waves by an antenna... ” 

Bohr: “No, unfortunately, that is not true. The contradictions do not disappear, they are 
simply shifted to another place... Think of the Planck radiation law. For the derivation 
of this law, it is essential that the energy of the atom have discrete values and change 
discontinuously ... You can’t seriously wish to question the entire foundation of quantum 
theory.” 

Schrodinger: “Naturally, I do not maintain that all these relations are already completely 
understood ... but I think that the application of thermodynamics to the theory of matter 
waves may eventually lead to a good explanation of Planck’s formula.”

Bohr: “No, one cannot hope for that. For we have known for 25 years what the Planck 
formula means. And also we see the discontinuities, the jumps, quite directly in atomic 
phenomena, perhaps on the scintillation screen or in a cloud chamber ... You can’t simply 
wave away these discontinuous phenomena as though they didn’t exist.” 

Schrodinger: “If we are still going to have to put up with these damn quantum jumps, I 
am sorry that I ever had anything to do with quantum theory.” 

Bohr: “But the rest of us are very thankful for it—that you have—and your wave 
mechanics in its mathematical clarity and simplicity is a gigantic progress over the previous 
form of quantum mechanics.” 

By the end of this discussion, Schrodinger fell ill and for a few days had to stay as a 
guest in Bohr’s house where Bohr’s wife, Margrethe, was taking care of him. In the 
same year Schrodinger wrote: 5 

My theory was inspired by L de Broglie ... and by short but incomplete remarks by A 
Einstein. ... No genetic relation whatever with Heisenberg is known to me. I knew of his 
theory, of course, but felt discouraged not to say repelled, by the methods of transcendental 
algebra, which appeared very difficult to me and by the lack of visualizability. 


5 E. Schrodinger, On the relationship between the Heisenberg-Born-Jordan quantum mechanics and 
mine. Ann. Phys. 70 , 734 (1926). 
Heisenberg fully understood the weakness of the matrix theory, writing in his 1925
paper, coauthored with Born and Jordan 6 : 

Admittedly, such a system of quantum-theoretical relations between observable quantities 
... would labor under the disadvantage that there can be no directly intuitive geometrical 
interpretation because the motion of electrons cannot be described in terms of the familiar 
concepts of space and time. 

The popularity of Schrodinger’s matter waves not only spelled purely
scientific trouble for Heisenberg, as a competing theory, but also jeopardized his career: he was
at the point of seeking a permanent university position, and the fondness of the professors in
charge of academic appointments for Schrodinger’s views created problems for
him. Thus, in 1927, Heisenberg undertook a concentrated effort to remedy
the perceived weaknesses of his approach to quantum theory, which resulted in the
celebrated Uncertainty Paper, 7 where his now famous uncertainty principle


$$\Delta x \, \Delta p \geq \frac{\hbar}{2} \tag{1.4}$$

was shown to be an unavoidable consequence of the inner structure of his quantum
formalism ($\Delta x$ and $\Delta p$ are loosely defined “uncertainties” of the position and
momentum). An equally important portion of the paper was devoted to expounding
the connections between the formalism and the real world. Heisenberg made a point
of emphasizing that in the end the goal of the theory is to predict the values of those
quantities that can be observed by a clearly defined physical process. Considering
operational and physical details of an observation of the coordinate, Heisenberg 
showed that it is impossible to carry out such an observation without disturbing the 
velocity of the electron, and therefore, such concepts as trajectory or continuous 
time dependence of a particle’s coordinate shall have no place in the theory of 
the electron’s properties. He illustrated these ideas with a thought experiment 
involving observation of an electron using a gamma-ray microscope. The idea of the
experiment is that in order to actually observe (“see”) the position of the electron, one
has to shine light on it and detect the reflected (or scattered, if you prefer) rays using
a microscope. The accuracy in measuring the position is limited by the microscope’s
resolution (the electron can be anywhere within a region resolved by the microscope),
which is proportional to the wavelength of light: the shorter the wavelength, the
smaller the region of space it is able to resolve. Thus to determine the precise position
of the electron, one must use light of a very short wavelength (hence, the gamma-ray
microscope in Heisenberg’s paper). The problem is, however, that light of very
short wavelength has very large momentum (see Eq. 1.3), which it transfers to the
electron, changing its momentum by an amount that is inversely proportional to
the wavelength. The precise value of the electron’s new momentum and its direction


6 M. Born, W. Heisenberg, P. Jordan, Zeitschr. Phys. 35, 557 (1926).

7 W. Heisenberg, The physical content of quantum kinematics and mechanics. Zeitschr. Phys. 43, 
172 (1927). 
are completely unknown, and, therefore, by trying to improve our knowledge of the
electron’s position, we destroy our ability to know its momentum.
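
The microscope argument can be condensed into a rough order-of-magnitude estimate. The sketch below assumes that the position uncertainty is of the order of the wavelength $\lambda$ and that the momentum kick is of the order of the full photon momentum $h/\lambda$; numerical factors of order unity, such as those coming from the aperture of the microscope, are ignored.

```latex
\Delta x \sim \lambda, \qquad
\Delta p \sim p_{\text{photon}} = \frac{h}{\lambda}
\quad\Longrightarrow\quad
\Delta x \, \Delta p \sim \lambda \cdot \frac{h}{\lambda} = h .
```

The wavelength cancels: shortening it improves $\Delta x$ only at the price of a proportionally larger $\Delta p$, so the product cannot be pushed below a value of the order of Planck's constant. This crude estimate is consistent with, though much less precise than, the inequality (1.4).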

Heisenberg’s uncertainty principle and the idea that it is the process of measurement
that defines what can be known about the system contributed to Bohr’s
thinking about his complementarity principle. Taken together with the uncertainty 
principle and Born’s interpretation of the results of quantum measurements in terms 
of probabilities, 8 it eventually resulted in the complete mathematical and conceptual 
framework of quantum theory known as the Copenhagen interpretation. According 
to this interpretation, the observable quantities describing a quantum system do 
not have any definite value before they are actually measured and can randomly 
take one of the allowed values only after a measurement is performed. The act of 
measurement abruptly changes the system, bringing it into the state corresponding to
the realized value of the measured quantity. 

Not everyone was happy with the Copenhagen interpretation, including one of 
the originators of the quantum revolution, Albert Einstein. He couldn’t reconcile 
himself with the loss of classical determinism, the idea that the true laws of
nature must provide us with complete and precise information about all essential 
characteristics of material objects and that given the right tools, we can always 
experimentally measure them. Einstein wrote to Max Born in 1926 9 : 

Quantum mechanics is certainly imposing. But an inner voice tells me that it is not yet the
real thing. The theory says a lot, but does not really bring us any closer to the secret of the
“old one.” I, at any rate, am convinced that He does not throw dice.

“He” in this context is a direct reference to the Creator, who appears explicitly
in a catchier and better-known expression of the same idea recorded in the book
Einstein and the Poet by William Hermanns: “God does not play dice with the world.”
Leaving aside the issue of Einstein’s religious beliefs, you need to understand
why this particular feature of quantum theory bothered him so much. After
all, Einstein had no objections to using probability in classical statistical physics 
and successfully used probabilistic concepts himself in his work on Brownian 
motion. The difference of course comes from the fact that the probabilities in 
classical statistical physics reflect the objective lack of information about positions 
and velocities of individual molecules, and in the absence of this information, the 
language of the distribution functions and probabilities is the best way to deal 
with this situation. In the Copenhagen interpretation of quantum mechanics, the 
statistical uncertainty was made inherent to the nature of the universe, and Einstein 
had difficulty with this idea. For many years, Einstein and Bohr participated in
public discussions and exchanged letters, in which Einstein tried to find counterexamples
to the Copenhagen interpretation and Bohr debunked all of them. These
exchanges between Einstein and Bohr are among the finest examples of scientific debate,
both oral and epistolary.


8 M. Born, On the quantum mechanics of collisions. Zeitschr. Phys. 37, 863 (1926).

9 Letter to Max Born (4 December 1926); The Born-Einstein Letters (translated by Irene Born)
(Walker and Company, New York, 1971). 
In this book, I will rely on the traditional Copenhagen interpretation of quantum
theory and will present it taking the “bull by the horns”: in the spirit of Heisenberg’s
views that only those quantities which can be really observed in a particular set of 
experiments shall be used to describe quantum systems, I will start with the abstract 
concept of the quantum state defined by a set of such observable quantities. Then I 
will place this concept within a mathematical structure of a linear vector space and 
will proceed from there. The Schrodinger wave function will appear much later in 
the text (as compared to most other undergraduate-level textbooks) as one of many 
alternative and equivalent ways of describing the states of quantum systems. 

However, before plunging into the depths of the quantum formalism and its
applications, I would like to provide you with the historical background on the times
when quantum theory was born. While it will not help you understand the physics 
better, it will help you see that advances in physics were not an isolated incident but a 
part of a general trend in the movement of humanity toward modernity. You will also 
be able to appreciate that people responsible for the birth of quantum theory are not 
just abstract great names—they are real people with different views on life, different 
political preferences, and moral principles, people forced to make difficult choices 
outside of their professional lives and bearing responsibility for their choices. 


1.2 Brief Overview of the Historical Background 

The history of quantum theory from early childhood to maturity covers the time
period between 1900 and 1930 and involves such countries as Germany, Austria,
France, Italy, Denmark, the UK, and the USA. During this time, not only did the
facade of classical physics crumble, but the entire world changed drastically.
It was the time of great upheavals and great discoveries, unendurable misery,
and unbeatable achievements in science, architecture, arts, music, and literature.
The period between 1900 and 1914 was a time of inter-European and even
transcontinental integration (or what we would now call globalization) with a free
flow of goods, people, and capital across borders, and a time of liberalization
and democratization, when even the monarchic regimes of Germany, Austria-Hungary,
and Russia provided more political rights and more freedom to their citizens. The
liberalization and the growth of wealth that accompanied it were beneficial for 
developments in science, arts, music, and literature. Physics was not an exception— 
the free travel between countries and free exchange of ideas created fertile ground 
for new developments, which included relativity theory and quantum mechanics. 

At the same time, the wealth and prosperity weren’t shared by all, and the 
socialist ideas started attracting a significant number of followers. The Austro-Hungarian 
Empire proved to be the least stable and the most vulnerable to the rising 
nationalism of the smaller nations comprising it. On the surface, the everyday life of 
people in European countries appeared to follow the normal “business as usual” 
path; underneath this placidity the tensions were rising, especially in the Balkans. 
This was the calm before the storm, as everything came crashing down on June 
28, 1914, with the assassination of the heir to the Austro-Hungarian throne, Archduke 
Franz Ferdinand, at the hands of a Serbian teenage nationalist. On July 28 Austria 
declared war on Serbia, and on August 1 and 3, Germany declared war on Russia 
and France, respectively, and invaded Belgium. On August 4 Britain declared 
war on Germany, and World War I was in full swing. 

In response to the war, the nationalist and militarist fever was rising even 
among intellectuals: scientists, artists, and poets. Such prominent German physicists 
as Nobel Prize winners Max Planck, Philipp Lenard (photoelectric effect), 
Walther Nernst (third law of thermodynamics), Wilhelm Roentgen (X-rays), and 
Wilhelm Wien (black-body radiation) signed the infamous Manifesto of 93 declaring 
unequivocal support of the German occupation of Belgium, the actions known in 
history as the Rape of Belgium. The war devastated Europe: 18 million died (11 
million military personnel and 7 million civilians), and 23 million were wounded. 
This war witnessed the first widespread use of chemical weapons on the battlefield, 
most effectively by the German Army but also by the Austrians, French, and British. 
The chemical terms chlorine and mustard gas became household names. 

During the war, not surprisingly, there was a lull in the development of physics in 
general and quantum theory in particular as many scientists on all sides participated 
in the war efforts. In addition to direct deaths, suffering, and destruction, the war 
started a chain of events that led to the even greater horrors of World War II. 
The Austro-Hungarian Empire disintegrated, leaving in its wake a multitude of 
smaller countries in Central and Eastern Europe (Austria, Hungary, Yugoslavia, 
Czechoslovakia, Poland); the Russian Empire went through the bloody “Bolshevik” 
revolution, which gave birth to a cruel totalitarian regime under the guise of the 
“Dictatorship of the Proletariat.” Germany lost the war and was forced to sign the 
harsh Versailles peace treaty. By the terms of the treaty, Germany and her allies 
took all responsibility for all the losses and damages during the war, had to pay 
heavy reparations, and lost significant territory: Alsace-Lorraine and the 
output of the Saar coal mines went to France, while Upper Silesia, Eastern Pomerania, and part 
of Eastern Prussia went to Poland. The treaty left Germany humiliated and fostered 
feelings of resentment and a desire for revenge among large segments of the 
German population. 

The German Empire disintegrated and was replaced by the weak and ineffective 
Weimar Republic, which survived until 1933, when Hitler, newly appointed as 
Chancellor, suspended democratic procedures and declared himself Fuhrer of the 
Third German Reich. In Germany and to a lesser extent in Austria, the period 
of the Weimar Republic was a time of high unemployment, hyperinflation, and 
intense and often violent political struggles between socialists, liberal democrats, 
communists, conservatives, and national socialists. But it was also a time of 
an unprecedented explosion of creativity in science, architecture, literature, film, 
music, and arts. Berlin became a thriving cosmopolitan city, the center of attraction 
for artistic and scientific elites. In addition to quantum theory, this time produced 
significant advancements in nuclear physics and radioactivity, and engineering, but 
also gave rise to eugenics and radical racial theories. 

Fig. 1.1 From back to front and from left to right: Auguste Piccard, Emile Henriot, Paul 
Ehrenfest, Edouard Herzen, Theophile de Donder, Erwin Schrodinger, Jules-Emile Verschaffelt, 
Wolfgang Pauli, Werner Heisenberg, Ralph Howard Fowler, Leon Brillouin, Peter Debye, Martin 
Knudsen, William Lawrence Bragg, Hendrik Anthony Kramers, Paul Dirac, Arthur Compton, 
Louis de Broglie, Max Born, Niels Bohr, Irving Langmuir, Max Planck, Marie Sklodowska 
Curie, Hendrik Lorentz, Albert Einstein, Paul Langevin, Charles-Eugene Guye, Charles Thomson 
Rees Wilson, Owen Willans Richardson 


Quantum theory reached its maturity between 1923 and 1930, thanks to the 
resumption of international contacts and free flow of information. A big role in 
fostering the progress was played by the Solvay conferences that took place in Brussels 
and were financed by the Belgian industrialist Ernest Solvay. The first two conferences took 
place in 1911 and 1913 and resumed in 1921 after an 8-year interruption due to 
the war. These conferences were attended by all the main players in physics and 
chemistry of the times. The photograph above (Fig. 1.1) shows the attendees of the 
fifth conference that took place in 1927 and was a culmination of a struggle between 
Bohr’s and Einstein’s views on the interpretation of quantum mechanics. Bohr won, 
and from that time on, the Copenhagen interpretation dominated physicists’ thinking 
about quantum theory. 

By the end of the 1920s, the economic situation in Germany started improving, 
but the market crash of 1929 and the start of the Great Depression in the USA 
interrupted the recovery. The economy of the Weimar Republic fell into the abyss, 
facilitating the rise of the Nazis to power in 1933. The intellectual atmosphere in 
Germany and Austria had already begun to deteriorate in the beginning of the 1930s 
with the rise of the movement for Aryan physics spearheaded by such luminaries 
as Philipp Lenard and Johannes Stark, but after 1933 the situation for German 
physicists of Jewish origin became intolerable. Albert Einstein visited the USA in 
1933 and never returned; the same year, Max Born was suspended from his position 
at the University of Gottingen and emigrated to England, later settling in Scotland. 
Nazis drove away Jewish scientists not only from Germany and Austria but also 
from all of Central and Eastern Europe. Here is the incomplete list of the refugee 
physicists: Hans Bethe, John von Neumann, Leo Szilard, James Franck, Edward 
Teller, Rudolf Peierls, and Klaus Fuchs. Enrico Fermi, whose wife was Jewish, fled 
fascist Italy. Ironically, while trying to preserve the racial purity of their science, 
Nazis destroyed German predominance in physics (as well as in other areas of 
intellectual and artistic pursuit) and made America into a science powerhouse. 

Not only physicists of Jewish origin bore the wrath of the Nazis in Germany and 
Austria. Erwin Schrodinger, who was known to oppose Nazism, was ordered not 
to leave the country after Hitler declared Anschluss (union) between Germany and 
Austria in 1938. Luckily, he managed to escape to Italy, from where he moved to the 
UK and finally settled in Dublin as the Head of the Institute for Advanced Studies. 
During this time, he tried to develop a unified field theory, but his most important 
work of this period was the book What Is Life?, where he introduced the idea that 
complex molecules can contain genetic information. He returned to Vienna in 1956. 

However, not all physicists fled, and some even joined the National Socialist 
Party. Among the most prominent Nazi physicists was the aforementioned Philipp 
Lenard, who contributed to the discovery of photoelectric effect and won the Nobel 
Prize in 1905. He was an ardent anti-Semite and dismissed Einstein’s works as 
“Jewish science.” Lenard lived through the war and was demoted from his emeritus 
position at Heidelberg University in 1945 by Allied forces. 

Especially sad is the case of Born’s student and collaborator Pascual Jordan. 
He joined the Nazi party in 1933 and even became a member of its SA unit 10 
and enlisted in the Luftwaffe (German Air Force) in 1939. During World War II, he 
attempted to interest party officials in various weapon schemes but was deemed 
politically unreliable. Indeed, to his honor, he refused to condemn Einstein and 
other Jewish physicists. After the war, Wolfgang Pauli interceded on Jordan’s behalf 
declaring him rehabilitated. This allowed Jordan to continue his academic career 
and even secure a tenured position in 1953. Still, flirting with Nazism cost him 
the Nobel Prize, which he would have probably shared with Born. 

I also feel obliged to mention that some German physicists who remained in 
Germany during the Nazi era, while not actively opposing the regime (which would 
be akin to signing a death sentence), still behaved in a very noble way. Arnold 
Sommerfeld, who was nominated 84 times for the Nobel Prize, but never won, and 
was among the earlier contributors to quantum theory, was an admirer of Einstein 
and made special efforts to help his Jewish students and assistants, such as Rudolf 
Peierls and Hans Bethe, find employment outside of Germany. Another patriarch 
of German science, Otto Hahn, who made significant contributions to physics and 
chemistry of radioactivity and won the 1944 Nobel Prize, was an opponent of 
national socialism. Einstein wrote that Hahn was “one of the very few who stood 
upright and did the best he could in these years of evil.” For instance, he fostered a 
longtime collaboration with an Austrian physicist with Jewish roots, Lise Meitner. 
In 1938, Hahn helped her escape from Berlin to the Netherlands, giving her his 
mother’s diamond ring to bribe the frontier guards if needed. After the war, Hahn 
became the founding president of the Max Planck Society in the new Federal 
Republic of Germany and one of the most respected citizens of the new country. 

10 Sturmabteilung, or SA, the original paramilitary wing of the Nazi party, played a significant 
role in Hitler’s rise to power. 

Werner Heisenberg, on the other hand, found himself in a very difficult situation. 
While not a Nazi, he thought of himself as a German patriot and believed that Hitler 
was a necessary evil to save Germany. So, he stayed put during the Nazi period 
justifying it by the desire to preserve German physics. He also agreed to lead the 
Nazi nuclear program. Obviously, his relationship with former friends became very 
strained. It is known that he visited Bohr in 1941 while Denmark was under Nazi 
occupation. The content of the meeting remains a mystery, and Bohr refused to 
provide his account of what transpired, but the meeting did not go too well and Bohr 
was visibly upset. Bohr wrote down his account of the meeting, but it was sealed 
in his personal papers by the decision of his family. The mystery of this meeting 
inspired the award-winning play Copenhagen by Michael Frayn and a film of 
the same name, where the role of Heisenberg was played by Daniel Craig (future 
James Bond in the last three installments of the series: Casino Royale, Quantum of 
Solace, and Skyfall). I personally found Craig very convincing as Heisenberg, even 
when he talked about scientific matters. After the war, Heisenberg was cleared of 
accusations of Nazi collaboration in the course of the denazification process and 
was allowed to continue his scientific career. However, his behavior during the Nazi 
time isolated him from other European and American physicists. 

2.1 Classical and Quantum States 

The task of any physical theory is to develop the means to predict the results of 
the measurements of a physical quantity sometime in the future based upon the 
information about the current state of the system under study and knowledge about 
interaction of this system with its environment. The term “state” has many different 
uses in physics: it is used to describe different states of matter (solids, liquids, 
etc.), and in thermodynamics we derive various equations of state, which are relations between 
various thermodynamic parameters. We can also talk about a particular state of a 
system, meaning specific values of a set of quantities important for a problem at 
hand. In this book I will consider, for the most part, systems consisting of a small 
number of particles, placed in a variety of different environments. The states, which 
I will be dealing with here, are “mechanical states,” but the precise meaning of this 
term depends upon the choice of a conceptual framework, classical or quantum, with 
which the problem is approached. 

In classical physics a mechanical state is most completely described by specifying 
coordinates and velocities of the particles at any given time. As Laplace said: 

Given for one instant an intelligence which could comprehend all the forces by which nature 
is animated and the respective positions of the beings which compose it, if moreover this 
intelligence were vast enough to submit these data to analysis, it would embrace in the same 
formula both the movements of the largest bodies in the universe and those of the lightest 
atom; to it nothing would be uncertain, and the future as the past would be present to its 
eyes. 

In modern language it means that acting forces and initial coordinates and velocities 
determine future positions of all bodies in the universe with an accuracy limited 
only by the accuracy of the available experimental and computational instruments. 
Mathematically, the evolution of classical states of a single particle is described 
by ordinary differential equations of the second order, where given values of 
coordinates and velocities form a set of initial conditions necessary to find their 
unique solution. I am ignoring here complications arising in the cases of nonlinear 
chaotic dynamics, when a uniquely existing solution becomes unstable with respect 
to small variations in initial conditions, so that practical predictability of the 
equations of motion is lost. These situations are excluded from consideration in 
this book. 
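
The role of initial conditions can be illustrated with a short numerical sketch. It assumes a particular system that is not discussed in the text, a one-dimensional harmonic oscillator obeying x'' = -omega^2 x, and a particular integration scheme (velocity Verlet); the function name evolve and all numerical values are hypothetical choices made only for illustration.

```python
import math

def evolve(x0, v0, omega=1.0, dt=1e-3, t_end=1.0):
    """Integrate x'' = -omega**2 * x starting from the initial state (x0, v0).

    Uses the velocity-Verlet scheme; omega, dt and t_end are illustrative values.
    """
    steps = int(round(t_end / dt))
    x, v = x0, v0
    a = -omega**2 * x
    for _ in range(steps):
        x += v * dt + 0.5 * a * dt**2      # advance the coordinate
        a_new = -omega**2 * x              # force at the new coordinate
        v += 0.5 * (a + a_new) * dt        # advance the velocity
        a = a_new
    return x, v

x, v = evolve(x0=1.0, v0=0.0)
# Exact solution for this initial state with omega = 1: x = cos(t), v = -sin(t)
print(x, math.cos(1.0))
print(v, -math.sin(1.0))
```

The same pair (x0, v0) always produces the same later state, and the accuracy of the prediction is limited only by the step size of the computation.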

Finding states of a classical particle can be significantly simplified if the system 
under consideration possesses conserved quantities, called integrals of motion, 
such as energy or angular momentum. These quantities do not change in the course 
of the time evolution of the system, and, therefore, their values are determined by 
the initial conditions. Knowledge of these quantities makes solving the equations 
of motion easier: for instance, a differential equation of the second order can be 
reduced to the equation of the first order, or a three-dimensional problem can be 
converted into its one-dimensional equivalent. At the same time, while simplifying 
the technical task of solving equations of motion, the existence of integrals of 
motion does not change the fundamental nature of the classical description of the 
system. 
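
As a worked illustration of this reduction, consider a single particle of mass m moving in one dimension in a potential U(x). This specific setup and the symbols m, U, and E are assumptions introduced here for the example. Conservation of the energy E turns the second-order equation of motion into a first-order one:

```latex
m\ddot{x} = -\frac{dU}{dx}
\quad\Longrightarrow\quad
\frac{d}{dt}\left[\frac{m\dot{x}^{2}}{2} + U(x)\right] = 0
\quad\Longrightarrow\quad
\frac{m\dot{x}^{2}}{2} + U(x) = E ,
\qquad
\dot{x} = \pm\sqrt{\frac{2\,[E - U(x)]}{m}} .
```

The value of E is fixed by the initial coordinate and velocity, so the description remains fully classical; only the technical work of solving the equation has been reduced.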

In a quantum world, we are forced to give up the ability to have complete 
knowledge of all desirable parameters (coordinates, velocities, energy, etc.) and, 
therefore, have to redefine the meaning of the term “state.” To develop the concept of 
quantum state, I first introduce the idea of an observable defined as any quantity 
whose numerical value can be experimentally measured. The list of possible 
observables is essentially the same as the list of classical parameters: it can include 
coordinates, momenta, energies, angular momenta, etc. The main difference 
between quantum and classical descriptions appears, however, when you recognize 
that in the quantum world, according to the complementarity principle discussed 
in the Introduction, not all observables can be measured within the same set of 
experiments. For instance, reinterpreting Heisenberg’s uncertainty principle, 
you may say that if a system is in a state with a precisely known coordinate, 
its momentum cannot be ascribed any definite value and vice versa: if the 
system is in a state with precisely known momentum, its coordinate remains 
completely undefined. However, you can imagine that there might exist more than 
one observable, whose values can be found with certainty for the same state of 
the system (an obvious example of such observables is the three components of the 
position vector or the three components of the momentum vector). Such observables, 
called mutually consistent, play an important role in quantum theory. The largest set 
of such observables is called a complete set of mutually consistent observables. 
Mutually consistent observables often correspond to classical integrals of motion 
such as energy, angular momentum, etc., which in quantum theory are of much 
greater importance than in classical mechanics. The complete set of mutually 
consistent observables is all the information which we can have about a quantum 
state of a system, and therefore, it seems reasonable to define a quantum state simply 
as a collection of values of these observables. 

To proceed I need to translate the words into some kind of mathematical formulas 
or at least something which looks like mathematical formulas. So, brace yourself 
as I am going to hit you with some heavy formal stuff. Let’s say that the complete 
set of compatible observables contains $N_{\max}$ elements $\{q_i\}$, $1 \le i \le N_{\max}$. Let $q_i^{(k)}$ 
represent the $k$-th value of the $i$-th observable, and let us assume that any given observable 
$q_i$ can either change continuously or take a discrete set of values. In the former 
case, we say that the observable has a continuous spectrum, while in the latter case, 
observables are said to have a discrete spectrum. The British scientist Paul Dirac (I will 
tell you more about him later in the book), one of the founding fathers of quantum 
theory who shared the 1933 Nobel Prize in Physics with Schrodinger, suggested a 
rather picturesque and intuitively appealing way of representing a quantum state as 

$$|q_1^{(k_1)}, q_2^{(k_2)}, \ldots, q_{N_{\max}}^{(k_{N_{\max}})}\rangle \qquad (2.1)$$

For now this is just a fancy notation, but as I will continue to develop quantum 
formalism, its utility will become more and more apparent. Information contained 
in Eq. 2.1 is quite transparent: this expression tells us that if the system is in the state 
given by Eq. 2.1 and we measure observable $q_i$, the result of the measurement will 
be value $q_i^{(k_i)}$. States, in which at least one of the observables has different values, 
are mutually exclusive, in a sense that if you repeat the measurement over and over, 
you will never observe two distinct values of the same observable as long as the 
measurement is performed on the system in the state described by Eq. 2.1. 

The states of the type presented in Eq. 2.1, in which all mutually consistent 
observables have definite values, are the simplest, but not the only possible states 
of quantum systems. Actually, as you already know, the whole brouhaha about 
quantum mechanics and Einstein’s rejection of it was because of the fact that 
frequently observables would not have definite values, so that one could not predict 
with certainty an outcome of a measurement. If a measurement is performed 
on an ensemble of identical systems, all placed in such a state (same for all 
systems), different members of the ensemble would generate different outcomes 
in an unpredictable manner. A similar situation arises in the case of repeating 
measurements on the same system provided it is returned back to its initial state after 
each measurement. From a theoretical point of view, states with uncertain outcomes 
of a measurement can be described as a linear superposition of the simple states as 
discussed in the next section. 


2.2 Quantum States and Hilbert Vector Space 
2.2.1 Superposition Principle 

Experimental evidence of existence of quantum superposition states comes from 
observations of interference of electrons or neutrons described in any textbook 
dealing with the basic ideas of quantum mechanics. To explain the connection 
between interference and superposition, we usually draw on an analogy with 
electromagnetic waves, where interference is a result of the addition of two spatially 
overlapping coherent waves. Depending upon the phase difference between the 
waves at any particular point in space, one can observe either bright interference 
fringes (constructive interference resulting from the phase difference $2\pi n$, where $n$ 
is an integer) or dark fringes (destructive interference resulting at points where the 
phase difference is $(2n+1)\pi$). Thus, the argument goes, the interference of quantum 
particles must also result from the superposition of something having wavelike 
properties. The particle-wave duality was briefly described in the Introduction, and 
it is also discussed in lots of other textbooks including the already quoted The 
Feynman Lectures on Physics , so I will not dwell any longer on that. I will, however, 
use the existence of quantum interference to justify introducing superposition of the 
states defined in Eq. 2.1. While interference experiments provide convincing but 
indirect evidence for quantum superposition, recently, superposition states of a few 
photons and atoms have been observed directly. 
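
The wave analogy used above can be checked with a few lines of code. The sketch assumes two coherent waves of equal unit amplitude, represented by the complex numbers 1 and exp(i*phase); the function name intensity and the choice n = 3 are hypothetical and serve only as an illustration.

```python
import cmath
import math

def intensity(phase_difference):
    """Relative intensity of two equal-amplitude coherent waves.

    The waves are modeled as the complex amplitudes 1 and exp(i*phase);
    the intensity is the squared modulus of their sum.
    """
    total = 1 + cmath.exp(1j * phase_difference)
    return abs(total) ** 2

bright = intensity(2 * math.pi * 3)        # phase difference 2*pi*n with n = 3
dark = intensity((2 * 3 + 1) * math.pi)    # phase difference (2n+1)*pi with n = 3
print(bright, dark)
```

Up to floating-point rounding, the first value is four times the intensity of a single wave (a bright fringe), and the second is zero (a dark fringe).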

Let me assume that a quantum system can be in one of two states $|q_1\rangle$ and $|q_2\rangle$ 
characterized by different values $q_1$ and $q_2$ of some observable, $q$, with a discrete 
spectrum. According to the superposition principle, this system can also be in a state 
formed by a linear superposition of these two states: 

$$|\text{superposition}\rangle = a_1 |q_1\rangle + a_2 |q_2\rangle \qquad (2.2)$$

where $a_1$ and $a_2$ are, in general, complex numbers. Equation 2.2 can be interpreted 
verbally by saying that in order to form a superposition state $|\text{superposition}\rangle$, one 
must “multiply” each of the states $|q_1\rangle$ and $|q_2\rangle$ by complex numbers $a_1$ and $a_2$, 
respectively, and then “add” the results. The problem here is, of course, that since 
we have no idea about the mathematical nature of the object I call “quantum state” 
(is it a function, or a number, or some other mathematical object?), I cannot actually 
tell you how to perform operations I called multiplication and addition and what 
they actually mean (this is why I surrounded “add” and “multiply” by the quotation 
marks). Is it possible to make sense of Eq. 2.2 without first assigning some more 
concrete mathematical meaning to these “quantum states”? It might surprise you, 
but the answer is yes, it is possible, and mathematicians do this all the time. It is not 
really necessary to know either the mathematical nature of the state objects or the 
meaning of algebraic operations we want to perform with them. All we need 
is to postulate that these objects and operations exist and possess certain properties. 
If this sounds too abstract for you, consider this. Modern object-oriented computer 
languages such as C++ are based exactly on this idea—they introduce an “object,” 
which can be anything (a number, a matrix, a word, a figure), and define various 
operations with them such as addition, multiplication, etc. All these operations have 
well-defined properties, but their concrete meaning depends upon an object to which 
they are applied. Thus, the symbol “+” between two numbers means one thing, while 
the same symbol between words or matrices means something completely different, 
but it still is the same operation because it has the same basic properties. 
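
The programming analogy can be made concrete with a small sketch. It is written in Python rather than C++ purely for brevity; the class State and its internal dictionary of coefficients are hypothetical constructions introduced only so that the example can run, and they are not meant as a statement about what a quantum state "really" is.

```python
class State:
    """Hypothetical stand-in for an abstract "quantum state" object.

    It stores a mapping label -> complex coefficient; this representation is
    an assumption made only so that the example can run.
    """

    def __init__(self, coefficients):
        self.coefficients = dict(coefficients)

    def __add__(self, other):
        # "add": combine coefficients label by label
        result = dict(self.coefficients)
        for label, value in other.coefficients.items():
            result[label] = result.get(label, 0) + value
        return State(result)

    def __rmul__(self, number):
        # "multiply": scale every coefficient by a (complex) number
        return State({label: number * value
                      for label, value in self.coefficients.items()})

    def __eq__(self, other):
        return self.coefficients == other.coefficients


q1 = State({"q1": 1})
q2 = State({"q2": 1})
superposition = 0.6 * q1 + 0.8j * q2    # same form as Eq. 2.2

print(2 + 3)                  # numbers: arithmetic sum
print("wave" + "function")    # words: concatenation
print(superposition.coefficients)
```

The same symbol “+” is used in all three cases, and the language decides what it does from the objects on either side of it.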



2.2.2 Linear Vector Spaces 


I am afraid that now I have to get a bit more abstract with you than you would have 
probably liked, but do not get too frustrated about it. After all you probably have 
passed a bunch of calculus classes taught by math professors forcing you to learn 
proofs of all those theorems of existence and uniqueness. The stuff I am feeding you 
here is just small potatoes compared to that. Anyway, I will begin by postulating that 
all quantum states can be represented by special objects belonging to a certain class 
or “space” defined in such a way that if objects given by Eq. 2.1 belong to this 
“space,” then the objects defined by Eq. 2.2 also belong to it. (Note that the word 
“space” here has a completely different meaning than in our everyday language or 
even in the language of the introductory physics courses and replaces such words as 
class or set of objects with special properties.) I will also assume that there exists a 
null object $|0\rangle$ such that $|q\rangle + |0\rangle = |q\rangle$ and $0 \cdot |q\rangle = |0\rangle$ (where 0 in the last expression 
is just the number zero, unlike $|0\rangle$, which is used to designate the null state object, and 
the dot $\cdot$ means multiplication by a number). I will also postulate a few distributive 
and associative properties (ditching the dot $\cdot$ for the sake of compactness and out of 
habit) such as 


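$$a \left(|q_1\rangle + |q_2\rangle\right) = a |q_1\rangle + a |q_2\rangle \qquad (2.3)$$

$$a_1 |q_i\rangle + a_2 |q_i\rangle = (a_1 + a_2) |q_i\rangle \qquad (2.4)$$

$$a_1 \left(a_2 |q_i\rangle\right) = (a_1 a_2) |q_i\rangle \qquad (2.5)$$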


which allows carrying out standard algebraic operations with the quantum state 
objects. 

A familiar example of objects, which possess all the properties specified in 
Eqs. 2.2-2.5 provided that the coefficients $a_i$ are confined to the set of real numbers, 
is given by the usual three-dimensional vectors, such as displacement, or velocity 
vectors used in elementary mechanics. All operations used in Eqs. 2.2-2.5 have in 
this case specific definitions: multiplication by a number is defined as multiplication 
of a vector’s length by a number while keeping its direction intact for positive 
coefficients and reversed for negative coefficients, and the addition of vectors is 
defined by a triangle or parallelogram rule. It is a matter of simple geometry and 
algebra to prove the distributive and associative properties of these operations as 
presented in Eqs. 2.3-2.5. 
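
For readers who prefer a numerical check to a geometric proof, here is a minimal sketch. It assumes that the vectors are stored as triples of Cartesian components, so that the parallelogram rule becomes component-wise addition; the helper names add and scale and all numerical values are hypothetical.

```python
def add(u, v):
    """Component-wise sum, equivalent to the parallelogram rule."""
    return tuple(ui + vi for ui, vi in zip(u, v))

def scale(a, u):
    """Multiply a vector by a real number."""
    return tuple(a * ui for ui in u)

# Illustrative values chosen so that all arithmetic is exact in floating point
q1 = (1.0, 2.0, -3.0)
q2 = (0.5, -4.0, 2.0)
a, a1, a2 = 2.0, 3.0, -1.5

eq_2_3 = scale(a, add(q1, q2)) == add(scale(a, q1), scale(a, q2))
eq_2_4 = add(scale(a1, q1), scale(a2, q1)) == scale(a1 + a2, q1)
eq_2_5 = scale(a1, scale(a2, q1)) == scale(a1 * a2, q1)
print(eq_2_3, eq_2_4, eq_2_5)
```

A check with particular numbers is, of course, not a proof; it only shows what each of the three properties says when the operations are given a concrete definition.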

Abstract objects satisfying the abovementioned properties are also called vectors 
and are said to belong to a linear vector space. I will use notation based on Greek 
letters such as $|\alpha\rangle$, $|\beta\rangle$, etc. to represent generic elements of the vector space (not 
necessarily vectors based on values of certain physical observables). Even though 
these abstract vectors are quite different from what you are used to calling vectors 
in classical mechanics (e.g., they do not point to any direction in our regular 
three-dimensional space and can have infinitely many components, represented by 
complex numbers), I am going to use some of what you know about regular vectors 
to help you infer additional properties of our abstract vector objects. 
For instance, you know that regular three-dimensional vectors can be presented 
as a sum of three mutually perpendicular unit vectors, whose directions are 
predetermined by an arbitrary choice of three coordinate axes ($X$, $Y$, and $Z$). These 
unit vectors form what is called a basis in the three-dimensional space: a set 
of vectors which cannot be expressed as linear combinations of one another but 
can be used to present any other vector in this space as their linear combination: 
$\mathbf{A} = A_x \mathbf{e}_x + A_y \mathbf{e}_y + A_z \mathbf{e}_z$, where $\mathbf{e}_x$, $\mathbf{e}_y$, $\mathbf{e}_z$ are the unit vectors in the direction of the 
$X$, $Y$, and $Z$ axes, respectively. In order to expand the idea of basis to the arbitrary 
space of abstract linear vectors, I first have to acquaint you with the concept of 
linear independence of vectors. A set of vectors $|q_1\rangle, |q_2\rangle, \cdots, |q_N\rangle$ is called linearly 
independent if none of the vectors in the set can be presented as a linear combination 
of others. A set of linearly independent vectors is complete if adding any other 
distinct vector to the set makes it linearly dependent. The number of linearly 
independent vectors in the complete set determines the dimension of the space. This 
number does not need to be finite—there are spaces with infinite number of linearly 
independent vectors. From the definition of the complete set of linearly independent 
vectors, it follows that any other vector in a given space can be presented as their 
linear combination: 

$$|\alpha\rangle = \sum_{n} a_n |q_n\rangle \qquad (2.6)$$


where summation is over all vectors in the complete set. Such a set of vectors is 
called a basis in a given space. Apparently, representation of an arbitrary vector 
in the form of Eq. 2.6 is a formal generalization of the physical superposition 
principle. I can also identify the set of states characterized by different values 
of the complete set of mutually consistent observables with a set of linearly 
independent vectors. Indeed, these states are mutually exclusive and correspond to 
certain values of the corresponding observables so that they cannot be presented 
as a superposition since the superposition generates states with uncertain values of 
the corresponding observables. In addition, since I am assuming that I am dealing 
with a complete set of consistent observables, I cannot add any more states to the 
set, which means that the set is linearly independent and complete, i.e., can be 
considered as a basis. Starting with a basis based on the complete set of mutually 
consistent observables, I can, in principle, form such linear combinations, which 
would remain linearly independent among themselves even though they would not 
any longer correspond to definite values of any physical observables. From a purely 
mathematical standpoint, this set still forms a basis, so that the choice of the basis is 
not unique. 
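
The non-uniqueness of the basis is easy to see in the familiar three-dimensional case. The sketch below is restricted to ordinary real three-component vectors and uses Cramer's rule to find the expansion coefficients; both restrictions, as well as the names det3 and expand and the sample vectors, are assumptions made for this illustration only.

```python
def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))

def expand(v, basis):
    """Coefficients of v in the given basis of three vectors (Cramer's rule)."""
    b1, b2, b3 = basis
    d = det3(b1, b2, b3)
    if d == 0:
        # a zero determinant means one vector is a combination of the others
        raise ValueError("vectors are linearly dependent, not a basis")
    return (det3(v, b2, b3) / d, det3(b1, v, b3) / d, det3(b1, b2, v) / d)

cartesian = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
skewed = ((1, 1, 0), (0, 1, 1), (1, 0, 1))   # also linearly independent
v = (2, 3, 5)

print(expand(v, cartesian))
print(expand(v, skewed))
```

The same vector has different coefficients in the two bases, and a set such as (1, 0, 0), (0, 1, 0), (1, 1, 0) is rejected because its third member is the sum of the first two.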

In addition to generalizing the concept of the basis, I can use the example of 
three-dimensional geometrical vectors to introduce one more operation involving 
vectors. You, of course, know that two three-dimensional vectors can be combined 
to form a “dot” or “scalar” product, which plays an important role in physics. 
Consider, for instance, the vector of force F and position vector r. Assuming that the 
force is constant, you can define its work, W, as the scalar product of the force and 
the position vector: $W = \mathbf{F} \cdot \mathbf{r}$. You know how to compute this dot product; actually, 
you even know two different equivalent ways of doing so. The dot product can be 
computed either as 


$$W = |\mathbf{F}| |\mathbf{r}| \cos\theta \qquad (2.7)$$

where $|\mathbf{F}|$ and $|\mathbf{r}|$ are magnitudes of the force and of the position vector, while $\theta$ is 
the angle between these two vectors, or as 

$$W = F_x x + F_y y + F_z z \qquad (2.8)$$

where $F_{x,y,z}$ are components of the force along the $X$, $Y$, and $Z$ axes of a specified 
coordinate system, respectively, while $x$, $y$, and $z$ are the corresponding components 
of the position vector (or coordinates, if you wish). The magnitudes of the force 
and position vectors can be defined via the scalar products of the vectors with 
themselves, e.g., $|\mathbf{F}| = \sqrt{\mathbf{F} \cdot \mathbf{F}}$. The magnitudes of the vectors and their dot products 
possess certain important properties expressed by inequalities listed below: 

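$$|\mathbf{A}| \ge 0 \qquad (2.9)$$

$$|\mathbf{A} \cdot \mathbf{B}| \le |\mathbf{A}| |\mathbf{B}| \qquad (2.10)$$

$$|\mathbf{A} + \mathbf{B}| \le |\mathbf{A}| + |\mathbf{B}|. \qquad (2.11)$$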

The first of these inequalities, as well as the statement that the equality in it is only 
reached for a null vector, is quite trivial for regular vectors. Inequality 2.10 follows 
from the limitation on the values of the cosine function, $|\cos\theta| \le 1$, and the last of 
these inequalities expresses the well-known geometrical fact that the sum of any two 
sides of a triangle is never smaller than the third side. The equality in this case can 
only be reached for degenerate triangles, in which all sides are aligned along a 
single line. 
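
The three properties can be checked numerically for any pair of ordinary vectors. The following is a minimal sketch in Python; the sample vectors A and B and the helper names `dot` and `magnitude` are hypothetical and chosen only for illustration.

```python
import math

def dot(a, b):
    """Component form of the dot product, as in Eq. 2.8."""
    return sum(x * y for x, y in zip(a, b))

def magnitude(a):
    """Magnitude defined through the dot product of a vector with itself."""
    return math.sqrt(dot(a, a))

# Hypothetical sample vectors, chosen only for illustration.
A = (1.0, -2.0, 2.0)
B = (3.0, 0.0, 4.0)

A_plus_B = tuple(x + y for x, y in zip(A, B))

# Inequality 2.10: |A.B| <= |A||B|
cauchy_schwarz_holds = abs(dot(A, B)) <= magnitude(A) * magnitude(B)
# Inequality 2.11: |A+B| <= |A| + |B|
triangle_holds = magnitude(A_plus_B) <= magnitude(A) + magnitude(B)

# The angle form, Eq. 2.7, recovers the same dot product.
cos_theta = dot(A, B) / (magnitude(A) * magnitude(B))
angle_form = magnitude(A) * magnitude(B) * cos_theta
```

Here the component form (Eq. 2.8) is taken as the working definition, and the angle is then derived from it, which is the order of logic that carries over to abstract vectors.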

Since the magnitude and the dot product play such an important role in 
applications of regular vectors, you would be correct to think that it is a clever idea 
to introduce similar operations for the abstract vectors as well. The problem is 
that since I do not know what my abstract vectors actually are, I cannot give the 
magnitude and the dot product an operational definition (meaning a prescription 
for how to compute them, similar to Eq. 2.7 or 2.8). What I can and will do is to 
postulate that these operations exist and define them by requiring that whatever they 
are operationally, they must have the same properties as those given in Eqs. 2.9-2.11. 
When talking about abstract vectors, however, it is customary to use the term 
“norm” instead of the magnitude and “inner product” instead of the dot product. 
The notation usually used for the norm of an abstract vector $|\alpha\rangle$ is $\|\alpha\|$, so that Eq. 2.9 
takes the form of 

$$\|\alpha\| \ge 0 \tag{2.12}$$

which obviously implies that the norm is necessarily real-valued (as opposed to 
complex-valued). However, an attempt at defining the norm as an inner product of 
the vector with itself, as well as defining the inner product by simply extending 
Eq. 2.8 to an arbitrary number of components, results in a problem. 

It turns out that unlike the case of regular three-dimensional vectors, in general it 
is not possible to define the inner product using only vectors belonging to the same 
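vector space. To see why this is so, consider an example of single-column matrices, 
$\mathbf{k}$, with N rows ($N \times 1$ matrices, or column vectors) with complex-valued elements. 
Multiplication by a complex number $a$, denoted $a\mathbf{k}$, is in this case defined in the obvious way as 

$$a\mathbf{k} = a\begin{bmatrix} k_1 \\ k_2 \\ \vdots \\ k_{N-1} \\ k_N \end{bmatrix} = \begin{bmatrix} a k_1 \\ a k_2 \\ \vdots \\ a k_{N-1} \\ a k_N \end{bmatrix} \tag{2.13}$$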


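and produces another column vector. The addition of two column vectors, $\mathbf{k}+\mathbf{p}$, is 
defined as 

$$\mathbf{k}+\mathbf{p} = \begin{bmatrix} k_1 \\ k_2 \\ \vdots \\ k_{N-1} \\ k_N \end{bmatrix} + \begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_{N-1} \\ p_N \end{bmatrix} = \begin{bmatrix} k_1 + p_1 \\ k_2 + p_2 \\ \vdots \\ k_{N-1} + p_{N-1} \\ k_N + p_N \end{bmatrix} \tag{2.14}$$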


and also produces a column vector. Obviously, column vectors form a linear space. 
Now, in regular matrix algebra, you can define two types of products, the inner 
product and the outer product, but neither of them can be introduced using only column 
vectors. To introduce either of these two operations, you need to combine a column 
vector with a row vector (a $1 \times N$ matrix). If you place the row vector to the 
left of the column vector and use the regular matrix multiplication rule (row by 
column), you can convert the two vector objects into a number: 

$$\begin{bmatrix} k_1 & k_2 & \cdots & k_{N-1} & k_N \end{bmatrix}\begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_{N-1} \\ p_N \end{bmatrix} = k_1 p_1 + k_2 p_2 + \cdots + k_N p_N. \tag{2.15}$$


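This is the inner product of the row and the column vectors. If you swap the two 
objects, placing the column to the left of the row, you can generate an $N \times N$ matrix 
using the following rule: 

$$\begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_{N-1} \\ p_N \end{bmatrix}\begin{bmatrix} k_1 & k_2 & \cdots & k_{N-1} & k_N \end{bmatrix} = \begin{bmatrix} p_1 k_1 & p_1 k_2 & \cdots & p_1 k_N \\ p_2 k_1 & p_2 k_2 & \cdots & p_2 k_N \\ \vdots & \vdots & \ddots & \vdots \\ p_N k_1 & p_N k_2 & \cdots & p_N k_N \end{bmatrix} \tag{2.16}$$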


What you get here is the so-called outer or tensor product of the row and the column 
vectors. 

The row vectors do not belong to the same vector space as column vectors (you 
cannot add a column and a row): they form their own space called adjoint space. In 
order to establish a proper relationship between the row space and the column space, 
we need to determine how to convert a column vector into a row vector. It appears 
that the answer is almost trivial: one can do it using the matrix operation known as 
transposition, which transforms a column vector $\mathbf{k}$ into the respective row vector $\mathbf{k}^T$. 
However, it is easy to see that by using simple transposition, you will not be able 
to generate an inner product and a norm satisfying the condition presented in 
Eq. 2.12. Indeed, generalizing Eq. 2.9, you can try introducing the norm using an 
inner product of a row obtained by transposition of the initial column and the column 
itself. This procedure will yield 

$$\mathbf{k}^T\mathbf{k} = \begin{bmatrix} k_1 & k_2 & \cdots & k_{N-1} & k_N \end{bmatrix}\begin{bmatrix} k_1 \\ k_2 \\ \vdots \\ k_{N-1} \\ k_N \end{bmatrix} = k_1^2 + k_2^2 + \cdots + k_N^2. \tag{2.17}$$


If $k_i$ are complex-valued quantities, the result of this multiplication is not necessarily 
real-valued, in clear contradiction with the required property of the norm. The 
problem, however, can be fixed if, in addition to transposing a column vector, 
you also complex conjugate its elements. The resulting operation is called 
Hermitian transposition or Hermitian conjugation, and it turns a column vector $\mathbf{k}$ 
into its adjoint or Hermitian conjugate row vector $\mathbf{k}^\dagger$. Using Hermitian conjugation 
rather than simple transposition turns Eq. 2.17 into 

$$\mathbf{k}^\dagger\mathbf{k} = \begin{bmatrix} k_1^* & k_2^* & \cdots & k_{N-1}^* & k_N^* \end{bmatrix}\begin{bmatrix} k_1 \\ k_2 \\ \vdots \\ k_{N-1} \\ k_N \end{bmatrix} = |k_1|^2 + |k_2|^2 + \cdots + |k_N|^2, \tag{2.18}$$


where $|k_i|$ means the absolute value of the respective complex number. Now you 
can define the inner product of two column vectors, $\mathbf{k}$ and $\mathbf{p}$, denoted $(\mathbf{k},\mathbf{p})$, as an operation 
which involves Hermitian conjugation of the vector on the left and forming an inner 
product of the resulting row with the column on the right: $(\mathbf{k},\mathbf{p}) = \mathbf{k}^\dagger\mathbf{p}$. It is obvious that 
the inner product defined this way is not commutative, meaning that, in general, $(\mathbf{k},\mathbf{p}) \ne (\mathbf{p},\mathbf{k})$. 
Actually, it is easy to see that $(\mathbf{k},\mathbf{p}) = (\mathbf{p},\mathbf{k})^*$. Using Hermitian conjugation one 
can also define a new kind of outer product, but I will leave the discussion 
of the latter till later. 
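
The difference between plain transposition and Hermitian conjugation is easy to see on a small example. The sketch below uses hypothetical two-component vectors `k` and `p`; the function names are illustrative and not part of any library.

```python
def transpose_product(k, p):
    """Row-by-column product using plain transposition (no conjugation)."""
    return sum(ki * pi for ki, pi in zip(k, p))

def inner(k, p):
    """Inner product (k, p): Hermitian-conjugate k, then multiply row by column."""
    return sum(ki.conjugate() * pi for ki, pi in zip(k, p))

# Hypothetical two-component column vectors with complex elements.
k = [1 + 1j, 2j]
p = [3 + 0j, 1 - 1j]

plain = transpose_product(k, k)   # (1+i)^2 + (2i)^2 = -4 + 2i, not real
norm_squared = inner(k, k)        # |1+i|^2 + |2i|^2 = 2 + 4 = 6, real
kp = inner(k, p)                  # (k, p)
pk = inner(p, k)                  # (p, k), the complex conjugate of (k, p)
```

With transposition alone the candidate "norm squared" comes out complex, while the conjugated version is real and nonnegative, as Eq. 2.12 demands.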

A linear vector space with a defined inner product and a norm becomes 
something which mathematicians call a Hilbert space. (There are some mathematical 
niceties and details concerning the exact definition of the Hilbert space, but they 
are of no concern to us here.) I hope that the example with column and row vectors 
helps you to realize that in order to define the norm and the inner product for our 
abstract vectors $|\alpha\rangle$, you need complementary vectors inhabiting an adjoint space. 
Again following Dirac, I will designate a vector adjoint to $|\beta\rangle$ as $\langle\beta|$ and use the notation 
$\langle\beta|\alpha\rangle$ for the inner product. In the case of abstract vectors, we do not really know 
how to actually compute the inner product, but whatever its operational definition 
might be, we require that it obeys the following condition: 

$$\langle\beta|\alpha\rangle = \left(\langle\alpha|\beta\rangle\right)^* . \tag{2.19}$$

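This property ensures that the norm defined as 

$$\|\alpha\| = \sqrt{\langle\alpha|\alpha\rangle} \tag{2.20}$$

is real and nonnegative. Indeed, applying Eq. 2.19 to the case when $|\beta\rangle = |\alpha\rangle$, you 
have $\langle\alpha|\alpha\rangle = \left(\langle\alpha|\alpha\rangle\right)^*$, proving that $\langle\alpha|\alpha\rangle$ is real-valued. 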

To distinguish between adjoint vectors in speech, we call $|\alpha\rangle$ a ket vector and 
$\langle\alpha|$ a bra vector. These terms were introduced by Paul Dirac together with 
the respective notation. Their origin can be traced to the word “bracket” split into two 
halves, “bra” and “ket,” just like the angular brackets in $\langle\alpha|\alpha\rangle$ are split between the two vectors (what 
happened to the letter “c” in the process is anybody’s guess). 

So far, converting a single generic ket into the respective bra has required no 
operational prescription: all you need to do is change the orientation and position 
of the angular bracket from $|\;\rangle$ to $\langle\;|$. However, an important question which we need 
to figure out now is how to do this conversion in the case of such expressions as 
$a|\alpha\rangle$ or even more complex expressions of the kind of Eq. 2.2. What is clear is that 
the adjoint of this expression must look like 

$$\left(a|\alpha\rangle\right)^\dagger = \tilde{a}\,\langle\alpha| ,$$

where I used the symbol $\dagger$ to designate conversion to the adjoint space (or performing 
Hermitian conjugation just like in Eq. 2.18). Now I need to find how the coefficient $\tilde{a}$ 
is related to $a$. To this end let me compute the norm $\|a\alpha\|$: 

$$\|a\alpha\|^2 = \left(\tilde{a}\,\langle\alpha|\right)\left(a|\alpha\rangle\right) = \tilde{a}a\,\langle\alpha|\alpha\rangle .$$

For this expression to be real-valued for all possible coefficients $a$, I have no choice 
but to require that $\tilde{a} = a^*$. This yields a simple rule for Hermitian conjugation 
of complex expressions involving ket vectors: convert all kets into bras by simply 
replacing one angular bracket with the other and complex conjugate all numerical 
coefficients appearing in the original expression. Here are a few examples of 
the application of this rule: 

Example 1 (Hermitian Conjugation) Perform Hermitian conjugation of the following 
expression: 

$$|\alpha\rangle = (2i + 3)|q_1\rangle + 5i|q_2\rangle .$$

Solution 

$$\langle\alpha| = (-2i + 3)\langle q_1| - 5i\langle q_2| .$$

Example 2 (Norm Calculation) If $\|q_1\| = 2$, $\|q_2\| = 4$, and $\langle q_1|q_2\rangle = i - 1$, find 
the norm of $|\alpha\rangle$ defined in the previous example. 

Solution 

$$\begin{aligned}
\|\alpha\|^2 &= \left((-2i+3)\langle q_1| - 5i\langle q_2|\right)\left((2i+3)|q_1\rangle + 5i|q_2\rangle\right) \\
&= (-2i+3)(2i+3)\|q_1\|^2 + 5i(-5i)\|q_2\|^2 + 5i(-2i+3)\langle q_1|q_2\rangle - 5i(2i+3)\langle q_2|q_1\rangle \\
&= 13\times 4 + 25\times 16 + (10+15i)(i-1) + (10-15i)(-i-1) = 402
\end{aligned}$$

$$\|\alpha\| = \sqrt{402}.$$
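
The arithmetic of Example 2 can be verified with a few lines of Python. This is only a sketch: the 2×2 table of inner products (called `gram` here, a name not used in the text) is assembled from the data given in the example, with the second off-diagonal entry obtained from Eq. 2.19.

```python
import math

# Coefficients of |alpha> = (2i + 3)|q1> + 5i|q2> from Example 1.
coeffs = [3 + 2j, 5j]

# Inner products <q_i|q_j> from Example 2: ||q1|| = 2, ||q2|| = 4, <q1|q2> = i - 1.
# <q2|q1> follows from Eq. 2.19 as the complex conjugate of <q1|q2>.
q1q2 = -1 + 1j
gram = [[2 ** 2, q1q2],
        [q1q2.conjugate(), 4 ** 2]]

# ||alpha||^2 = sum_ij conj(a_i) <q_i|q_j> a_j: the bra coefficients are conjugated.
norm_squared = sum(
    coeffs[i].conjugate() * gram[i][j] * coeffs[j]
    for i in range(2)
    for j in range(2)
)

norm = math.sqrt(norm_squared.real)
```

The two cross terms are complex conjugates of each other, so their imaginary parts cancel and the sum is real, as it must be for a norm.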

It will be shown in the future development of the formalism that vectors $|\alpha\rangle$ and 
$a|\alpha\rangle$, where $a$ is an arbitrary nonzero complex coefficient, describe the same quantum state, 
i.e., contain the same information about the system. Therefore, it is often convenient 
to deal only with the states whose norm is equal to unity (in a way such states are 
analogous to unit vectors in a regular three-dimensional space). Such vectors can 
be produced via a normalization procedure, which consists in replacing an original 
vector $|\alpha\rangle$ with the vector $|\alpha\rangle / \|\alpha\|$. In the future, we shall assume that all abstract 
vectors used in calculations are normalized, and those which are not will have to be 
normalized before any calculations with them are performed. 
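
As a minimal sketch of the normalization procedure, assume (this is an assumption of the sketch, not a general requirement) that the vector is given by its expansion coefficients over vectors of unit norm with vanishing mutual inner products, so that the norm squared is simply the sum of the squared absolute values of the coefficients. The coefficients below are hypothetical.

```python
import math

def norm(coeffs):
    """Norm of a vector expanded over orthonormal vectors: sqrt(sum |c_i|^2)."""
    return math.sqrt(sum(abs(c) ** 2 for c in coeffs))

def normalize(coeffs):
    """Replace |alpha> with |alpha> / ||alpha||."""
    n = norm(coeffs)
    if n == 0:
        raise ValueError("the null vector cannot be normalized")
    return [c / n for c in coeffs]

# Hypothetical expansion coefficients over three orthonormal vectors.
alpha = [1, 2j, -2]
alpha_normalized = normalize(alpha)
```

The null vector is excluded explicitly because it is the only vector whose norm vanishes, so the division would be undefined.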

Example 3 (Normalization) Normalize the following state: 

$$|\alpha\rangle = 2|q_1\rangle + 3i|q_2\rangle - \frac{2}{3}|q_3\rangle,$$

assuming that the norm of all vectors $|q_i\rangle$ is unity and that inner products involving 
pairs of different vectors are all equal to zero. 

Solution 

1st step: Hermitian conjugation 

$$\langle\alpha| = 2\langle q_1| - 3i\langle q_2| - \frac{2}{3}\langle q_3| .$$

2nd step: Forming the inner product 

$$\|\alpha\|^2 = \langle\alpha|\alpha\rangle = \left(2\langle q_1| - 3i\langle q_2| - \frac{2}{3}\langle q_3|\right)\left(2|q_1\rangle + 3i|q_2\rangle - \frac{2}{3}|q_3\rangle\right) = 4 + 9 + \frac{4}{9} = \frac{121}{9} .$$

3rd step: The normalization 

$$|\alpha\rangle_N = \frac{3}{11}\left(2|q_1\rangle + 3i|q_2\rangle - \frac{2}{3}|q_3\rangle\right) = \frac{6}{11}|q_1\rangle + \frac{9i}{11}|q_2\rangle - \frac{2}{11}|q_3\rangle .$$

I will complete this discussion of vector states in their ket and bra reincarnations 
by considering another important example of vector spaces and respective inner 
products and norms. Consider a class of complex functions, $\psi(x)$, of a single 
variable $x$ defined over the domain $x \in (-\infty, \infty)$, which also satisfy the condition 
that $\int_{-\infty}^{\infty}|\psi(x)|^2 dx < \infty$. It is easy to show that linear combinations of such 
functions also belong to the same class, so they do form a linear vector space. The 
inner product of two functions $\psi(x)$ and $\varphi(x)$ defined as 

$$\int_{-\infty}^{\infty}\psi^*(x)\varphi(x)\,dx \tag{2.21}$$

satisfies the condition presented by Eq. 2.19, and the respective norm 

$$\|\psi\| = \sqrt{\int_{-\infty}^{\infty}|\psi(x)|^2 dx}$$

is obviously real-valued and nonnegative. Thus, these functions, called square-integrable, 
do form a Hilbert vector space, in which for each ket vector $|\alpha\rangle = \psi(x)$, 
there is an adjoint bra vector $\langle\alpha| = \psi^*(x)$ with an inner product and a norm defined 
as respective integrals. 

So, I gave you two very different concrete realizations of an abstract Hilbert space: 
column vectors and square-integrable functions, with two different operational 
definitions of the inner product. Despite the difference in the operational meaning 
of the inner product in these two cases, both have the same defining properties. 
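
To make the last point concrete, here is a sketch that checks the defining property, Eq. 2.19, for both realizations. The integral of Eq. 2.21 is approximated by a midpoint sum over a truncated interval, which is an assumption that works only for rapidly decaying functions; the vectors `k`, `p` and the functions `psi`, `phi` are hypothetical.

```python
import cmath
import math

def inner_columns(k, p):
    """Inner product of column vectors: conjugate the left one, multiply row by column."""
    return sum(a.conjugate() * b for a, b in zip(k, p))

def inner_functions(psi, phi, x_min=-10.0, x_max=10.0, steps=4000):
    """Midpoint-sum approximation to the integral of psi*(x) phi(x) dx (Eq. 2.21).

    The infinite domain is truncated to [x_min, x_max], which is adequate only
    for functions that decay quickly, such as the Gaussians used below.
    """
    dx = (x_max - x_min) / steps
    total = 0j
    for n in range(steps):
        x = x_min + (n + 0.5) * dx      # midpoint of the n-th subinterval
        total += psi(x).conjugate() * phi(x) * dx
    return total

# Hypothetical square-integrable functions chosen for illustration.
def psi(x):
    return cmath.exp(1j * x) * math.exp(-x * x / 2)

def phi(x):
    return complex(x * math.exp(-x * x / 2))

# Hypothetical column vectors.
k = [1 + 2j, -1j]
p = [2 - 1j, 3 + 0j]

# Eq. 2.19 in both realizations: swapping the arguments conjugates the result.
columns_symmetric = abs(inner_columns(k, p) - inner_columns(p, k).conjugate()) < 1e-12
functions_symmetric = abs(inner_functions(psi, phi) - inner_functions(phi, psi).conjugate()) < 1e-12

# The squared norm of psi is real and close to sqrt(pi) for this Gaussian.
psi_norm_squared = inner_functions(psi, psi)
```

The operational recipes differ (a finite sum in one case, an integral in the other), yet both obey the same conjugation property and both give a real squared norm.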


2.2.3 Superposition Principle and Probabilities 

States characterized by definite values of the complete set of mutually consistent 
observables have an important mathematical property, which I cannot yet prove (it 
will be done later), but it is important for the argument I am trying to present, so 
you will have to trust me on this for now. So, here is the property: 

$$\left\langle q_l^{(1)}, q_n^{(2)}, \cdots q_r^{(N_{max})} \,\Big|\, q_k^{(1)}, q_m^{(2)}, \cdots q_p^{(N_{max})} \right\rangle = \delta_{l,k}\,\delta_{n,m}\cdots\delta_{r,p}, \tag{2.22}$$

where I assumed that the state vectors are normalized ($\delta_{l,k}$ is the Kronecker delta, which 
is equal to unity for coinciding indexes, and zero otherwise). This equation surely 
looks rather mysterious, but its actual content is rather simple. To see this, imagine 
that you are dealing with a system described by a single observable, say, energy. To 
make the matter even simpler, assume also that the energy can only take two values, 
0 and 1. Then you are dealing with only two states with definite values of energy, $|1\rangle$ 
and $|0\rangle$. Equation 2.22 in this case simply states that 

$$\langle 0|1\rangle = 0, \quad \langle 0|0\rangle = \langle 1|1\rangle = 1 .$$


Abstract vectors, whose inner product is equal to zero, are called orthogonal in 
the obvious generalization of the concept of orthogonality for regular perpendicular 
three-dimensional vectors, whose dot product is equal to zero. What I am driving 
at here is the connection between the mathematical concept of orthogonality and 
the physical notion of mutual exclusivity discussed in Sect. 2.1: Eq. 2.22 does say 
that mutually exclusive states are also orthogonal. If the basis is constructed of 
mutually orthogonal and normalized vectors, it is called orthonormalized. Such 
bases are particularly convenient to work with, and, therefore, they are used almost 
exclusively in practical calculations. Here is an example of a system of vectors 
forming an orthonormal basis, which you are quite familiar with. 

Example 4 (Fourier Series: Examples of a Discrete Orthonormal Basis) Consider 
a set of functions of a single variable $x$ defined on a finite interval $[0, L]$. These 
functions obviously form a linear space with an inner product defined as 

$$\langle f|g\rangle = \int_0^L f^*(x)g(x)\,dx .$$

It is well known that these functions can be expanded into a Fourier series¹ 

$$f(x) = \sqrt{\frac{1}{L}}\sum_{n=-\infty}^{\infty} a_n e^{i2\pi nx/L}$$

with expansion coefficients given by 

$$a_n = \sqrt{\frac{1}{L}}\int_0^L f(x)e^{-i2\pi nx/L}\,dx .$$

One can identify $|\chi_n\rangle = \sqrt{\frac{1}{L}}\,e^{i2\pi nx/L}$ with vectors of an orthonormal basis so that 
the Fourier series expansion of the function can be presented in the form of Eq. 2.6. 
Indeed, it is easy to see that 

$$\langle\chi_m|\chi_n\rangle = \frac{1}{L}\int_0^L dx\,e^{-i2\pi mx/L}e^{i2\pi nx/L} = \begin{cases} 1, & m = n \\ 0, & m \ne n. \end{cases}$$
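
The orthonormality of this basis can also be checked numerically. In the sketch below the integral over [0, L] is replaced with a rectangle-rule sum, and the interval length `L`, the number of sample points `STEPS`, and the test function `f` are hypothetical choices made for illustration.

```python
import cmath
import math

L = 2.0          # hypothetical interval length
STEPS = 64       # number of sample points for the rectangle rule

def chi(n, x):
    """Basis function chi_n(x) = sqrt(1/L) * exp(i 2 pi n x / L)."""
    return math.sqrt(1.0 / L) * cmath.exp(2j * math.pi * n * x / L)

def inner(f, g):
    """Rectangle-rule approximation to the integral of f*(x) g(x) over [0, L]."""
    dx = L / STEPS
    return sum(f(j * dx).conjugate() * g(j * dx) * dx for j in range(STEPS))

def overlap(m, n):
    """Approximate <chi_m|chi_n>."""
    return inner(lambda x: chi(m, x), lambda x: chi(n, x))

# Expansion coefficients a_n = <chi_n|f> for a hypothetical f(x) = cos(2 pi x / L).
def f(x):
    return complex(math.cos(2 * math.pi * x / L))

a_plus = inner(lambda x: chi(1, x), f)     # coefficient of chi_1
a_minus = inner(lambda x: chi(-1, x), f)   # coefficient of chi_{-1}
```

For this choice of f only the two coefficients with n = ±1 are nonzero, and both are equal to √L/2, in agreement with writing the cosine as a half-sum of two exponentials.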


¹ There are some conditions on the required smoothness of the functions for their Fourier series 
actually to represent them accurately, but I leave it to mathematicians to worry about these details. 
Here I just want to mention that the representation as a Fourier series actually works even for 
functions which are not necessarily continuous. 

Mutual exclusivity can be more formally expressed in terms of probabilities: if a 
quantum system is in a state with prescribed values $q_k^{(1)}, q_m^{(2)}, \cdots q_p^{(N_{max})}$ of a complete 
set of mutually consistent observables, then the measurement of any of these observables 
will produce the value appearing in the definition of the state with a probability 
equal to 1, while the probability of any other value is equal to zero. Now, let’s 
turn our attention to the superposition state of the type presented in Eq. 2.2 and ask 
a question: what should we expect if we measure observable $q$ and the system is 
in the state given by this equation? You already know that the exact result of the 
measurement cannot be predicted, but it is intuitively clear that it must be either $q_1$ 
or $q_2$. The only question you need to ponder now is this: if the measurement is 
repeated multiple times with the system always brought back to the same initial state 
(or if, instead of one system, you got your hands on an ensemble of identical systems 
all in the same state), what are the fractions of measurement outcomes yielding $q_1$ 
and $q_2$? It seems reasonable to assume that it is the coefficients $a_{1,2}$ in the superposition 
which will determine an answer to this question. Indeed, if I set either $a_2$ or $a_1$ to 
zero, I will take you back to the state in which all measurements yield either $q_1$ or 
$q_2$, respectively, while its counterpart is never observed. It is natural to describe this 
situation in terms of probabilities and assume that these probabilities are determined 
by the coefficients $a_i$. However, it is clear that I cannot identify the coefficients 
themselves with the probabilities because $a_i$ are not necessarily positive or even real 
(if this were not the case, we could not describe both constructive interference and 
destructive interference just like in the case of electromagnetic waves). At the same 
time, you can recall that in the case of wave interference, it is not the amplitudes 
of the waves but their intensities, proportional to the squared absolute values of the 
amplitudes, that determine the brightness of the interference fringes. Thus, we can 
surmise that in the case of quantum superposition, respective probabilities are given 
by $|a_i|^2$: 

$$p(q_i) = |a_i|^2 . \tag{2.23}$$

Multiplying Eq. 2.2 by $\langle q_1|$ or $\langle q_2|$ (the bra counterparts of the respective vectors $|q_i\rangle$) 
from the left and using the orthogonality condition, Eq. 2.22, I can derive for the 
coefficients $a_i$ 

$$\begin{aligned}
\langle q_1|\alpha\rangle &= \langle q_1|\left(a_1|q_1\rangle + a_2|q_2\rangle\right) = a_1\|q_1\|^2 \;\Rightarrow\; a_1 = \langle q_1|\alpha\rangle \\
\langle q_2|\alpha\rangle &= \langle q_2|\left(a_1|q_1\rangle + a_2|q_2\rangle\right) = a_2\|q_2\|^2 \;\Rightarrow\; a_2 = \langle q_2|\alpha\rangle,
\end{aligned} \tag{2.24}$$

where I took into account the convention that all vectors describing quantum states 
are presumed to be normalized. Expressions derived in Eq. 2.24 allow presenting 
Eq. 2.23 for the probability in a more generic form: 

$$p(q_i) = |\langle q_i|\alpha\rangle|^2 . \tag{2.25}$$

Applying Eq. 2.25 to the case of $|\alpha\rangle = |q_j\rangle$, you find $p(q_i) = \delta_{ij}$, establishing 
a formal correspondence between the notions of mutual exclusivity and orthogonality. 
Computation of the norm of the state $|\alpha\rangle$ yields 

$$\|\alpha\|^2 = |a_1|^2 + |a_2|^2 = p_1 + p_2 .$$

If $\|\alpha\| = 1$, i.e., the state $|\alpha\rangle$ is normalized as presumed, then you obtain the relation 
$p_1 + p_2 = 1$ in complete agreement with what is expected of the probabilities. This 
result reinforces my (well, actually Max Born’s) suggestion to interpret $|\langle q_i|\alpha\rangle|^2$ 
as a probability that the measurement of observable $q$ on a system in state $|\alpha\rangle$ will 
produce $q_i$. 
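
A short sketch of Eqs. 2.24 and 2.25 for a two-state superposition follows. The raw coefficients are hypothetical, and the basis states are represented, as an assumption of this sketch, by the columns (1, 0) and (0, 1).

```python
import math

# Hypothetical (unnormalized) coefficients a_1, a_2 of a two-state superposition.
raw = [1 + 1j, 1 - 2j]

# Normalize so that ||alpha|| = 1, as presumed for vectors describing quantum states.
norm = math.sqrt(sum(abs(a) ** 2 for a in raw))
a = [c / norm for c in raw]

# In the orthonormal basis |q_1>, |q_2> the kets are the columns (1, 0) and (0, 1),
# so <q_i|alpha> simply picks out the i-th coefficient (Eq. 2.24).
basis = [[1, 0], [0, 1]]

def bracket(bra, ket):
    """Inner product <bra|ket> of two column vectors."""
    return sum(complex(b).conjugate() * k for b, k in zip(bra, ket))

probabilities = [abs(bracket(q, a)) ** 2 for q in basis]   # Eq. 2.25
```

Although the coefficients themselves are complex, the resulting probabilities are real, nonnegative, and add up to unity once the state is normalized.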

It is important to emphasize that any uncertainty in the result of the measurement 
of the observable $q$ exists only before the measurement has taken place and that the 
probability referred to in this discussion describes the fraction of given outcomes in 
a series of such measurements. After the measurement is carried out, and one of 
the values of the observable is actually observed, all uncertainty has disappeared. 
We now know that the measurement yielded a particular value $q_i$, which, according 
to our earlier proposition, is only possible if the system is in the respective state 
$|q_i\rangle$. Thus we have to conclude that the act of measurement has destroyed the initial 
state $|\alpha\rangle$ and “collapsed” it into the state $|q_i\rangle$. The most intriguing question, of course, 
is what determines the state into which the system collapses. This question has 
been debated during the entire 100+ year-long history of quantum mechanics and is 
being debated still. The orthodox Copenhagen interpretation of quantum mechanics 
essentially leaves this question without an answer, claiming that the choice of the 
final (after measurement) state is completely random.² I propose that you accept 
this interpretation as quite sufficient for most practical purposes, even though it does leave 
people with a philosophical state of mind somewhat unsatisfied. 

Equation 2.2 describes a superposition of only two states. It is not too difficult 
to imagine that it can be extended to the case of an arbitrary number of states 
$|q_k^{(1)}, q_m^{(2)}, \cdots q_p^{(N_{max})}\rangle$ generated by mutually consistent observables with discrete 
spectrum: 

$$|\alpha\rangle = \sum_{k,m,\cdots,p} a_{k,m,\cdots,p}\,\left|q_k^{(1)}, q_m^{(2)}, \cdots q_p^{(N_{max})}\right\rangle . \tag{2.26}$$

This sum, in principle, can contain any number of terms, including infinitely 
many. In the latter case, of course, one has to start worrying about its 
convergence, but I will leave these worries to mathematicians. Coefficients $a_{k,m,\cdots,p}$ 
appearing in this equation have the same meaning as the coefficients in the two-state 
superposition: the respective probabilities are expressed by Eq. 2.25, in which $|q_i\rangle$ is replaced with the more general 
state appearing in Eq. 2.26. 


² At a talk given at the physics department of Queens College in New York in 2014, British 
mathematician J.H. Conway (currently Professor Emeritus of Mathematics at Princeton University) 
dismissed the randomness postulate of the Copenhagen interpretation as a “cop-out,” also arguing 
that the use of probabilities only makes sense when one deals with a well-defined ensemble 
of events or particles, which is not true in the case of a single electron or photon. At the same 
time, he and S. Kochen (Canada) proved a mathematical theorem asserting that the entire structure 
of quantum mechanics is inconsistent with the idea of the existence of some unknown characteristics 
of quantum systems, which would, should we find them, provide a deterministic description of the 
system. In this sense, they proved completeness of the existing structure of quantum theory 
and buried the idea of “hidden variables” (unknown elements of reality, which could restore 
determinism to the quantum world), provided that we are unwilling to throw away the entire 
conceptual structure of quantum mechanics, which, so far, has given an excellent quantitative explanation 
of a vast amount of experimental data. The Conway and Kochen theorem is called the “free will 
theorem” because it can be interpreted as an assertion that electrons, just like humans, have “free 
will,” which in a strict mathematical sense means that an electron’s future behavior might not be a 
deterministic function of its past. The description of the theorem can be found here: 
https://en.wikipedia.org/wiki/Free_will_theorem. 

2.3 States Characterized by Observables with Continuous Spectrum 

In the previous section, I considered only states generated by observables with 
discrete spectrum. As a result, even though the number of states in Eq. 2.26 can 
be infinite, they are still countable (one can enumerate them using natural numbers 
1, 2, 3, ⋯). Some observables, however, have a continuous spectrum, meaning that 
they can take values from a continuous (finite or infinite) interval of values. One 
such important observable is a particle’s position, measured by its position vector $\mathbf{r}$ 
or a set of Cartesian coordinates $(x, y, z)$, defined in a particular coordinate system. 
It is interesting to note in this regard that while in classical mechanics, descriptions 
using Cartesian coordinates are largely equivalent to those relying on spherical or 
polar coordinates, it is not so in quantum description, where angular coordinates 
in spherical or cylindrical systems do not easily submit to quantum treatment. 
This comment obviously appears somewhat cryptic here, but its meaning will be 
clarified in the subsequent chapters. Another peculiarity of the position observable 
is the need to carefully distinguish between coordinates being characteristics of a 
particle’s position and coordinates being markers of various points in space, needed 
to describe position dependence of various mathematical and physical quantities. 

Other observables, such as energy or momentum, might have either continuous or 
discrete spectrum depending upon the environment, in which a particle finds itself, 
or might have mixed spectrum, where an interval of discretely defined values crosses 
over into an interval of continuously distributed values. 

Two main peculiarities of states characterized by observables with continuous 
spectrum are that (1) they cannot be normalized in a regular sense of the word and 
(2) the concept of probability as defined by Eq. 2.25 loses its meaning because in 
the case of continuous random variables, the probability can only be defined for an 
interval (which might be infinitesimally small) of values, but not for any particular 
value of the variable. These two features are not independent of each other, 
as will be seen from the analysis that follows. 

I will illustrate the properties of states corresponding to observables with 
continuous spectrum using the position of a particle as an example. Assuming that 
there are no other observables mutually consistent with the position, I will present 
a state in which this observable has a definite value $\mathbf{r}$ as $|\mathbf{r}\rangle$. In order to construct 
a superposition state using these vectors, I have to replace the sum in Eq. 2.26 
with an integral over all possible values of the position vector $\mathbf{r}$, introducing instead 
of coefficients with discrete indexes a function $\psi(\mathbf{r})$ of a continuous variable: 

$$|\alpha\rangle = \int d^3r\,\psi(\mathbf{r})\,|\mathbf{r}\rangle . \tag{2.27}$$

Now I want to compute the norm $\|\alpha\|$ of the superposition state given by Eq. 2.27. 
The Hermitian conjugation of Eq. 2.27 produces the respective bra vector 

$$\langle\alpha| = \int d^3r\,\psi^*(\mathbf{r})\,\langle\mathbf{r}| , \tag{2.28}$$

so that the norm becomes 

$$\|\alpha\|^2 = \langle\alpha|\alpha\rangle = \int d^3r_1\,\psi^*(\mathbf{r}_1)\langle\mathbf{r}_1| \int d^3r_2\,\psi(\mathbf{r}_2)|\mathbf{r}_2\rangle = \iint d^3r_1\,d^3r_2\,\psi^*(\mathbf{r}_1)\psi(\mathbf{r}_2)\langle\mathbf{r}_1|\mathbf{r}_2\rangle , \tag{2.29}$$

where I had to rename the integration variables in order to be able to replace the 
product of integrals with a double integral (note that $\mathbf{r}_1$ appears in those parts of 
Eq. 2.29 which originate from the bra vector of Eq. 2.28, and $\mathbf{r}_2$ appears in the ket-related 
parts of the integral). States $|\mathbf{r}_1\rangle$ and $|\mathbf{r}_2\rangle$ remain mutually exclusive even in 
the case of continuous spectrum as long as $\mathbf{r}_1 \ne \mathbf{r}_2$. Thus I can write, based on the 
discussion in the previous sections, that $\langle\mathbf{r}_1|\mathbf{r}_2\rangle = 0$ for $\mathbf{r}_1 \ne \mathbf{r}_2$. If I now require 
that $\langle\mathbf{r}_1|\mathbf{r}_2\rangle = 1$ for $\mathbf{r}_1 = \mathbf{r}_2$, which would correspond to the “regular” normalization 
condition, I will end up with an integral in which the integrand is zero everywhere 
with the exception of one point, where it is finite. Clearly such an integral would be 
zero, in contradiction with the properties of the norm (it can only be zero for the null 
vector). To save the situation, something has to give, and I have to reject one of 
the assumptions made when evaluating Eq. 2.29. The mutual exclusivity of $|\mathbf{r}_1\rangle$ and 
$|\mathbf{r}_2\rangle$ and the related requirement that $\langle\mathbf{r}_1|\mathbf{r}_2\rangle = 0$ for $\mathbf{r}_1 \ne \mathbf{r}_2$ are connected with 
the basic ideas discussed in Sect. 2.2.3 and, therefore, appear untouchable. So, the 
only choice left to me is to reject the assumption that $\langle\mathbf{r}_1|\mathbf{r}_2\rangle$ at $\mathbf{r}_1 = \mathbf{r}_2$ is equal to unity or 
takes any other finite value. As a result we are left with the following requirements 
on $\langle\mathbf{r}_1|\mathbf{r}_2\rangle$: this expression must be zero for unequal values of its arguments while 
producing a non-zero result when being integrated with any “normal” function. 

These requirements are satisfied by an object called Dirac’s delta-function. Dirac 
introduced the notation $\delta(x)$ for its simplest single-variable version and presented most 
of its properties in the form useful for physicists in his influential 1930 book 
The Principles of Quantum Mechanics, which since then has been reissued many 
times (it is the same Paul Dirac who introduced the bra-ket notation for quantum 
states). It seems a bit unfair to name this object after him because it was already 
known to such mathematicians as Poisson and Fourier in the nineteenth century, but 
physicists learned about it from Dirac, so we stick to our guns and call it Dirac’s 
delta-function. The first thing one needs to understand about the delta-function is that 
it is not a function in any reasonable sense of the word. Therefore, the meaning 
of such operations as integration or differentiation involving this object cannot be 
defined following the standard rules of regular calculus. Nevertheless, physicists 
keep working with this object as though nothing is wrong (giving nightmares to 
rigor-sensitive mathematicians), with the only requirement being that the results of all 
performed operations must make sense (that is, from a physicist’s perspective). 
Mathematicians call such objects “distributions” or treat them as examples of 
“functionals.” Below I supply you with all the properties of the delta-function you will 
need to know. 

The main defining property of the delta-function of a single variable is 

$$\int_{x_1}^{x_2} f(x)\delta(x)\,dx = \begin{cases} f(0), & 0 \in [x_1, x_2] \\ 0, & 0 \notin [x_1, x_2] \end{cases} \tag{2.30}$$

with its immediate generalization to 

$$\int_{x_1}^{x_2} f(x)\delta(x - x_0)\,dx = \begin{cases} f(x_0), & x_0 \in [x_1, x_2] \\ 0, & x_0 \notin [x_1, x_2]. \end{cases} \tag{2.31}$$


These equations express the main property of the delta-function: it acts as a selector 
singling out the value of the function $f(x)$ at $x = x_0$, where the argument of the 
delta-function vanishes. In the particular case of $f(x) = 1$, Eq. 2.31 yields another important 
characteristic of the delta-function: 

$$\int_{x_1}^{x_2}\delta(x - x_0)\,dx = \begin{cases} 1, & x_0 \in [x_1, x_2] \\ 0, & x_0 \notin [x_1, x_2], \end{cases}$$

which expresses the idea that while the “width” of the delta-function is zero and 
its “height” is infinite, the area covered by it is equal to unity. An example of an actual 
limiting procedure producing a delta-function out of a regular function based on this 
idea can be found in the exercises in this chapter. 
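
One possible way to see the selector property at work is to replace the delta-function with a very narrow regular function of unit area. The sketch below assumes a Gaussian for this purpose; this is a choice made here for illustration and not necessarily the function used in the exercises. The helper names and the numerical parameters are hypothetical.

```python
import math

def narrow_gaussian(x, width):
    """Unit-area Gaussian; an assumed stand-in for the delta-function as width -> 0."""
    return math.exp(-(x / width) ** 2) / (width * math.sqrt(math.pi))

def integrate(func, x1, x2, steps=100_000):
    """Midpoint-rule approximation to the integral of func over [x1, x2]."""
    dx = (x2 - x1) / steps
    return sum(func(x1 + (n + 0.5) * dx) for n in range(steps)) * dx

def sift(f, x0, x1, x2, width=1e-3):
    """Approximate the integral of f(x) * delta(x - x0) over [x1, x2] (Eq. 2.31)."""
    return integrate(lambda x: f(x) * narrow_gaussian(x - x0, width), x1, x2)

inside = sift(math.cos, 0.5, -1.0, 2.0)      # x0 inside the interval: close to cos(0.5)
outside = sift(math.cos, 3.0, -1.0, 2.0)     # x0 outside the interval: close to 0
area = sift(lambda x: 1.0, 0.5, -1.0, 2.0)   # f(x) = 1: close to 1
```

The narrower the Gaussian, the closer `inside` gets to the value of the function at the selected point, while the area under the peak stays equal to unity.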

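One can also define the delta-function of a more complex argument such as 
$\delta[g(x)]$, where $g(x)$ is an arbitrary function. If $g(x)$ has only one zero at $x = x_0$, 
I can define $\delta[g(x)]$ by replacing $g(x)$ with the first term of its Taylor expansion 
around $x_0$, $g(x) \approx b(x - x_0)$, where $b = (dg/dx)_{x=x_0}$, and making the substitution of 
variable $\tilde{x} = b(x - x_0)$, which yields 

$$\int_{x_1}^{x_2} f(x)\delta[g(x)]\,dx = \frac{1}{b}\int_{g(x_1)}^{g(x_2)} f\!\left(x_0 + \frac{\tilde{x}}{b}\right)\delta(\tilde{x})\,d\tilde{x} = \frac{1}{|b|}f(x_0). \tag{2.32}$$

The expansion of $g(x)$ in the Taylor series is justified here because the value of the 
integral is determined by the behavior of this function in the immediate vicinity 
of $x_0$. The absolute value of $b$ appears because for $b < 0$ the substitution reverses 
the order of the integration limits. 

If the function $g(x)$ has multiple zeroes within the interval of integration, then we 
must isolate each zero and perform the procedure described above for each of them. 
The result will look something like this: 

$$\int_{x_1}^{x_2} f(x)\delta[g(x)]\,dx = \sum_i \frac{1}{|b_i|}f\!\left(x_0^{(i)}\right), \tag{2.33}$$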


where $b_i$ is the value of the derivative of $g(x)$ at the respective $i$-th zero $x_0^{(i)}$. To 
illustrate this procedure, consider an example. 

Example 5 (Delta-function With Two Zeros) Consider 

$$g(x) = x^2 - x_0^2 .$$

In this case the method outlined above yields 

$$\int_{x_1}^{x_2} f(x)\delta\left[x^2 - x_0^2\right]dx = \frac{1}{2|x_0|}\left[\int_{x_1}^{x_2} f(x)\delta(x - x_0)\,dx + \int_{x_1}^{x_2} f(x)\delta(x + x_0)\,dx\right] \tag{2.34}$$

where I assumed that both $x_0$ and $-x_0$ belong to the interval between $x_1$ and $x_2$. 

I can also define a derivative of the delta-function using integration by parts and assuming 
that the integral of $df/dx$ is still equal to $f(x)$ even if $f(x) = \delta(x)$. This is how it goes: 

$$\int_{x_1}^{x_2} f(x)\delta'(x - x_0)\,dx = f(x)\delta(x - x_0)\Big|_{x_1}^{x_2} - \int_{x_1}^{x_2}\delta(x - x_0)f'(x)\,dx = \begin{cases} -\left.\dfrac{df}{dx}\right|_{x=x_0}, & x_0 \in [x_1, x_2] \\ 0, & x_0 \notin [x_1, x_2]. \end{cases} \tag{2.35}$$

The boundary term vanishes here because the delta-function is equal to zero at $x_1$ and $x_2$ 
as long as $x_0$ does not coincide with either of them. 
Similarly one can define higher derivatives of the delta-function. 
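
The result of Example 5 can be tested numerically by once again substituting a narrow unit-area Gaussian for the delta-function (an assumption of this sketch, as are the test function and the numerical parameters).

```python
import math

def narrow_gaussian(u, width):
    """Unit-area Gaussian in u; an assumed approximation to delta(u) for small width."""
    return math.exp(-(u / width) ** 2) / (width * math.sqrt(math.pi))

def integrate(func, x1, x2, steps=200_000):
    """Midpoint-rule approximation to the integral of func over [x1, x2]."""
    dx = (x2 - x1) / steps
    return sum(func(x1 + (n + 0.5) * dx) for n in range(steps)) * dx

x0 = 1.5          # hypothetical position of the zeros, +x0 and -x0
f = math.exp      # hypothetical test function f(x)

def g(x):
    """Argument of the delta-function from Example 5."""
    return x * x - x0 * x0

# Left-hand side of Eq. 2.34 with delta replaced by a narrow Gaussian of g(x).
lhs = integrate(lambda x: f(x) * narrow_gaussian(g(x), 1e-2), -3.0, 3.0)

# Right-hand side: each zero contributes f(zero) / |g'(zero)| with |g'| = 2|x0|.
rhs = (f(x0) + f(-x0)) / (2 * abs(x0))
```

The two numbers agree up to a small correction that shrinks together with the width of the Gaussian, which is the numerical counterpart of keeping only the first term of the Taylor expansion of g(x).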

We will also need an important representation of the delta-function as a Fourier 
transform: 

$$\delta(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ikx}\,dk . \tag{2.36}$$

To demonstrate that this representation of the delta-function actually makes sense, 
consider direct and inverse Fourier transforms: 

$$f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\tilde{f}(k)e^{ikx}\,dk \tag{2.37}$$

$$\tilde{f}(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(x)e^{-ikx}\,dx . \tag{2.38}$$

Substituting Eq. 2.38 into Eq. 2.37, I get 

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x_1)e^{-ikx_1}e^{ikx}\,dk\,dx_1 = \frac{1}{2\pi}\int_{-\infty}^{\infty} dx_1\,f(x_1)\int_{-\infty}^{\infty} e^{ik(x - x_1)}\,dk$$

and the only way to make this into an identity for any function is to accept Eq. 2.36 
for the integral over k. 
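
One way to get a feeling for Eq. 2.36 is to cut the integration off at a finite value K and see what happens as K grows. The following is a sketch of this reasoning; the cutoff K is introduced here only for illustration and does not appear in the formulas above.

```latex
\frac{1}{2\pi}\int_{-K}^{K} e^{ikx}\,dk
  = \frac{1}{2\pi}\,\frac{e^{iKx} - e^{-iKx}}{ix}
  = \frac{\sin Kx}{\pi x},
\qquad
\int_{-\infty}^{\infty}\frac{\sin Kx}{\pi x}\,dx = 1 \quad \text{for any } K > 0 .
```

The central peak of this function has height K/π at x = 0 and a width of the order of π/K, so with increasing K it becomes taller and narrower while the area under it stays equal to unity, just as expected of a delta-function.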

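Finally, you will need a generalization of the delta-function to the case of 
several variables. For instance, the delta-function involving position vectors in Cartesian 
coordinates can be defined as 

$$\delta(\mathbf{r}_1 - \mathbf{r}_2) = \delta(x_1 - x_2)\,\delta(y_1 - y_2)\,\delta(z_1 - z_2), \tag{2.39}$$

in which case its representation in the form of a Fourier transform becomes 

$$\delta(\mathbf{r}_1 - \mathbf{r}_2) = \frac{1}{(2\pi)^3}\iiint_{-\infty}^{\infty} e^{ik_x(x_1 - x_2)}e^{ik_y(y_1 - y_2)}e^{ik_z(z_1 - z_2)}\,dk_x\,dk_y\,dk_z = \frac{1}{(2\pi)^3}\int_{-\infty}^{\infty} e^{i\mathbf{k}\cdot(\mathbf{r}_1 - \mathbf{r}_2)}\,d^3k . \tag{2.40}$$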


Now back to the calculation of the norm, i.e., to Eq. 2.29. To complete this
calculation, I will introduce a generalized, so-called delta-function normalization
condition for states $|\mathbf{r}\rangle$ by requiring that

$$\langle \mathbf{r}_1 | \mathbf{r}_2 \rangle = \delta(\mathbf{r}_1 - \mathbf{r}_2). \tag{2.41}$$

Substituting Eq. 2.41 into Eq. 2.29 and using the properties of the delta-function, I
finally arrive at

$$\|\alpha\|^2 = \int d^3r\,\psi^*(\mathbf{r})\,\psi(\mathbf{r}). \tag{2.42}$$

Now, in order to ensure correct normalization of the state $|\alpha\rangle$, you only need to
require that function $\psi(\mathbf{r})$ is chosen to be normalized such that

$$\int d^3r\,|\psi(\mathbf{r})|^2 = 1. \tag{2.43}$$

Example 6 (Normalization of a Wave Function) Normalize the state $|\alpha\rangle$ represented
by the following function:

$$\psi(x) = e^{ikx}e^{-ax^2/2}.$$

Solution

Using the definition of the norm, Eq. 2.42, I have

$$\|\alpha\|^2 = \int_{-\infty}^{\infty}\psi^*(x)\,\psi(x)\,dx = \int_{-\infty}^{\infty} e^{-ax^2}\,dx = \sqrt{\frac{\pi}{a}},$$

where I used the substitution of variables $y = \sqrt{a}\,x$ and the well-known integral

$$\int_{-\infty}^{\infty}\exp\left(-y^2\right)dy = \sqrt{\pi}.$$

Thus, the normalized form of the state can be written as

$$|\alpha\rangle = \left(\frac{a}{\pi}\right)^{1/4}\int_{-\infty}^{\infty} dx\,e^{ikx}e^{-ax^2/2}\,|x\rangle.$$

Function $\psi(\mathbf{r})$ in these expressions is called the wave function and is often cited
in quantum mechanics textbooks as the descriptor of a quantum state. You can see
now that this is not quite the case: the definition of the wave function involves
two different states, the actual state of the system $|\alpha\rangle$ and a state, $|\mathbf{r}\rangle$, in which a
particle would have a definite position. You can think of the wave function as a
projection of $|\alpha\rangle$ on $|\mathbf{r}\rangle$. If the position $\mathbf{r}$ could only take discrete values, we would
have interpreted $|\psi(\mathbf{r})|^2$ as a probability. In the continuous case, however, we can
only ask about a probability that the measurement of position would produce a result
within a certain (possibly infinitesimally small) volume around some central point $\mathbf{r}$.
The answer to this question is expressed in terms of the differential probability

$$dP(\mathbf{r}) = d^3r\,|\psi(\mathbf{r})|^2,$$

where $|\psi(\mathbf{r})|^2$ can be interpreted as the position probability density. The probability
that the measured position vector belongs to some finite volume $V$ is given by
the expression

$$P(V) = \iiint_V d^3r\,|\psi(\mathbf{r})|^2, \tag{2.44}$$

while the normalization condition 2.43 simply states the fact that the measurement
of the particle's position will produce some value within the entire volume available
to the particle with probability equal to one.
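The normalization found in Example 6 and the one-dimensional version of Eq. 2.44 can be checked with a few lines of code. This is a minimal sketch under my own assumptions: the values `a = 2` and `k = 3` are arbitrary, the infinite axis is replaced by the interval $[-10, 10]$, and `probability` is a hypothetical helper based on the trapezoid rule.

```python
import cmath
import math

a, k = 2.0, 3.0                      # arbitrary illustrative parameters

def psi(x):
    """Normalized wave function of Example 6: (a/pi)^(1/4) e^{ikx} e^{-a x^2 / 2}."""
    return (a / math.pi) ** 0.25 * cmath.exp(1j * k * x) * math.exp(-a * x * x / 2)

def probability(x_min, x_max, n=20000):
    """Trapezoid-rule estimate of the integral of |psi|^2 over [x_min, x_max]."""
    h = (x_max - x_min) / n
    total = 0.5 * (abs(psi(x_min)) ** 2 + abs(psi(x_max)) ** 2)
    for j in range(1, n):
        total += abs(psi(x_min + j * h)) ** 2
    return total * h

total_probability = probability(-10.0, 10.0)   # stands in for the whole axis
b = 1.0 / math.sqrt(a)
p_inside = probability(-b, b)                  # probability to find -b < x < b

print(total_probability)
print(p_inside, math.erf(1.0))
```

The first number should be equal to one, and the second should agree with $\mathrm{erf}(1)$, which is what the Gaussian density $\sqrt{a/\pi}\,e^{-ax^2}$ integrates to over this interval. Note that the phase factor $e^{ikx}$ drops out of both results.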
Problem 1 Consider two states:

$$|\psi_1\rangle = |\phi_1\rangle + i|\phi_2\rangle - 2|\phi_3\rangle$$

$$|\psi_2\rangle = -|\phi_1\rangle + 2|\phi_2\rangle - i|\phi_3\rangle,$$

where $|\phi_{1,2,3}\rangle$ are all normalized and orthogonal to each other.

1. Normalize states $|\psi_1\rangle$ and $|\psi_2\rangle$.

2. Find adjoint counterparts of these states, $\langle\psi_1|$ and $\langle\psi_2|$.

3. Compute inner products $\langle\psi_1|\psi_2\rangle$ and $\langle\psi_2|\psi_1\rangle$ and verify that
$\langle\psi_1|\psi_2\rangle = \langle\psi_2|\psi_1\rangle^*$.

4. Find a linear combination of states $|\psi_1\rangle$ and $|\psi_2\rangle$ that would be orthogonal to
$|\psi_1\rangle$.

5. Compute $\left(\langle\psi_1| + \langle\psi_2|\right)\left(|\psi_1\rangle + |\psi_2\rangle\right)$. Do it in two ways: (a) by computing the
sums first and then taking the inner product and (b) by using the distributive property
of the inner product to remove the parentheses and computing the inner products of
the resulting individual terms.

Problem 2 Determine if the following sets of vectors, defined by their components 
in some basis, are linearly dependent or independent: 

1. (2,2,0), (1,0,1), (0, f, —1) 

2. (0,0,1), (i, 0,0), (0,0,-1) 

3. ( 1 , f,2), ( 1 , /, — 1 ), (/,-/, 2i) 

Problem 3 Consider the set of functions 


$$f_n(x) = A\sin\frac{n\pi x}{L},$$


where L is a positive quantity. 

1. Prove that these functions form an orthogonal system with the inner product 
defined as 



2. Normalize these functions. 

3. Find an expression for coefficients $c_n$ in the expansion

$$f(x) = \sum_{n=0}^{\infty} c_n f_n(x),$$

where $f_n(x)$ is given by the expression from the first part of the problem with
amplitude $A$ replaced by the normalization coefficient found in Part 2.

Problem 4 Repeat problem 3 with the set of functions 

<Pn(x) = exp > 

and the inner product defined as

$$\langle\varphi_n|\varphi_m\rangle = \int_0^L \varphi_n^*(x)\,\varphi_m(x)\,dx.$$

Problem 5 Consider functions $g_1(x) = x$, $g_2(x) = x^2$, and $g_3(x) = x^3$ defined on
an interval $x \in [-1, 1]$ with inner product defined as

$$\langle g_n|g_m\rangle = \int_{-1}^{1} dx\,g_n(x)\,g_m(x).$$

1. Which of these three functions are mutually orthogonal, and which are not?

2. Consider a linear combination of functions $g_1(x)$ and $g_3(x)$, $a\,g_1(x) + b\,g_3(x)$, and
find coefficients $a$ and $b$ which would make this function orthogonal to $g_1(x)$.

3. Find a different linear combination of the same functions, which would be
orthogonal to $g_3(x)$.

4. Are these two new functions orthogonal to each other?

Problem 6 Consider a wave function of the form 

$$\psi(x) = A\,e^{ikx}\,x\,e^{-x^2/2}.$$

1. Normalize this function using the standard definition of the inner product for the
square-integrable functions.

2. Find the probability that a measurement of the x-coordinate of the particle will
produce a value in the interval $0 < x < \sqrt{2}$.

3. Find the probability that a measurement of the x-coordinate of the particle will
produce a value such that $x > 2$. Use mathematical tables or any available
computational tools to obtain the numerical values.

Problem 7 Consider a function of the form 

$$f(x) = \begin{cases} \dfrac{1}{\Delta}, & |x| < \Delta/2 \\ 0, & |x| > \Delta/2. \end{cases}$$

Show that this function turns into Dirac's $\delta$-function in the limit $\Delta \to 0$.
Problem 8 Compute the following integrals:

1 . 


l 

/ 


+ 5x) 8 (x — 2) dx 


2 . 


J (sin 2x + 2 tan 3x) <5 (x 2 — 5x + 4) dx 


3. 


oo 

/ 


xe * <5' (x + 5) dx 


Problem 9 Evaluate the following expression: 

$$\int_{-\infty}^{\infty} dx\,f(x)\int_{-\infty}^{\infty} dk\,k\,e^{ik(x - x_1)}.$$


Hint: Use the representation of the delta-function as a Fourier integral to figure out 
the integral with respect to k. 



Chapter 3 

Observables and Operators 




3.1 Hamiltonian Formulation of Classical Mechanics 

The version of classical mechanics based on forces and Newton's laws resists any
meaningful reformulation into a quantum theory because it depends critically on
concepts (trajectory, acceleration, etc.) that do not correspond to any observable
reality in the quantum world. More productive for finding links between classical
and quantum realms is an alternative formulation, where energy rather than force 
takes the central role. There are two essential elements in this formulation of 
classical mechanics. One is the idea of canonical coordinates in the so-called phase 
space (as opposed to regular three-dimensional configuration space), and the other 
is the concept of Hamiltonian. 

Points in the phase space represent classical states of the system, characterized,
for instance, by its coordinates $x_i$ and components of the momentum vector $p_i$. For
a single particle moving along a straight line (one-dimensional motion), the phase
space is two-dimensional; for the fully three-dimensional motion, the phase space is
six-dimensional; and for a three-dimensional motion of N particles, the dimension
of the phase space is 6N. Each point in the phase space represents the most complete
information about a classical system: its coordinates and velocities. When particles
move, their coordinates and momenta change, drawing a phase trajectory of the
system in the phase space. For a single particle allowed to move only along a
straight line, this trajectory is a curve in two-dimensional space. If the motion of the
particle is conservative, i.e., its energy is a conserved quantity, each phase trajectory
is an equienergetic line: each point on the trajectory corresponds to the state of
the system with exactly the same energy (energy does not change, while particles
change their position and momentum). Using the phase space, we effectively put
space coordinates and momentum of the particles on equal footing without imposing
any a priori relationships between them (as opposed to elementary mechanics, where
the momentum is defined via the time derivative of coordinates). You shall see that
the relationship between coordinate and momentum, called canonically conjugate
variables, arising within this framework is much closer to its quantum version than
it would have been in the Newtonian approach.

The Hamiltonian is essentially the energy of a conservative system expressed in
terms of coordinates and momentum, $H(\mathbf{p}, \mathbf{r})$, which in the case of a single particle
takes the form of

$$H(\mathbf{p}, \mathbf{r}) = \frac{p^2}{2m} + V(\mathbf{r}), \tag{3.1}$$


where $\mathbf{p}$ is the momentum vector¹ and $V(\mathbf{r})$ is the potential energy of the particle in
the external field. The Hamiltonian occupies a special place in classical mechanics
(as compared, for instance, to angular momentum, which can also be a conserved
quantity under certain circumstances) because it determines the system's dynamics via
Hamiltonian equations, which can be formulated as

$$\frac{dp_i}{dt} = -\frac{\partial H}{\partial r_i} \tag{3.2}$$

$$\frac{dr_i}{dt} = \frac{\partial H}{\partial p_i}, \tag{3.3}$$


where $r_i$ and $p_i$ ($i = 1, 2, 3$) are Cartesian components of the position and
momentum vectors, $x, y, z$ and $p_x, p_y, p_z$, respectively. Hamiltonian equations can be
rewritten in another interesting form using so-called Poisson brackets $\{f, g\}$ defined
for two arbitrary functions of canonical variables:

$$\{f, g\} = \sum_i\left(\frac{\partial f}{\partial r_i}\frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial r_i}\right). \tag{3.4}$$

Summation in Eq. 3.4 is over all relevant canonically conjugated pairs of coordinates.
It is easy to see that the Poisson brackets for momentum and corresponding
coordinates are

$$\{r_i, p_j\} = \delta_{ij}. \tag{3.5}$$

This form of Poisson brackets is called canonical: any pair of variables possessing
Poisson brackets of this form constitutes a canonically conjugated pair and satisfies
Hamiltonian equations 3.2 and 3.3.

¹ $p^2$ is defined as usual as the square of the magnitude of the vector in Cartesian coordinates, $p_x^2 + p_y^2 + p_z^2$.

Applying the definition of the Poisson brackets, Eq. 3.4, to the pairs of functions
$p_i$, $H$ and $r_i$, $H$, you can find (check it out!)

$$\{p_i, H\} = -\frac{\partial H}{\partial r_i},$$

$$\{r_i, H\} = \frac{\partial H}{\partial p_i},$$

so that Hamiltonian equations 3.2 and 3.3 can be rewritten in an even more symmetric
form:

$$\frac{dp_i}{dt} = \{p_i, H\} \tag{3.6}$$

$$\frac{dr_i}{dt} = \{r_i, H\}. \tag{3.7}$$


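Finally, the time derivative of an arbitrary function of canonical coordinates can
be expressed in terms of Poisson brackets involving the Hamiltonian. I illustrate this
statement for a function of only one pair of coordinates $f(x, p, t)$:

$$\frac{df}{dt} = \frac{\partial f}{\partial t} + \frac{\partial f}{\partial p}\frac{dp}{dt} + \frac{\partial f}{\partial x}\frac{dx}{dt} = \frac{\partial f}{\partial t} - \frac{\partial f}{\partial p}\frac{\partial H}{\partial x} + \frac{\partial f}{\partial x}\frac{\partial H}{\partial p} = \frac{\partial f}{\partial t} + \{f, H\}. \tag{3.8}$$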
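If you would like to "check it out" with a computer rather than with a pencil, here is a small sketch that estimates Poisson brackets for one pair of canonical variables by finite differences. Everything specific in it is my own assumption made for illustration: the harmonic-oscillator Hamiltonian with `m = 2` and `k = 3`, the phase-space point, and the helper name `poisson_bracket`.

```python
def poisson_bracket(f, g, x, p, h=1e-5):
    """Central-difference estimate of {f, g} = f_x g_p - f_p g_x at the point (x, p)."""
    f_x = (f(x + h, p) - f(x - h, p)) / (2 * h)
    f_p = (f(x, p + h) - f(x, p - h)) / (2 * h)
    g_x = (g(x + h, p) - g(x - h, p)) / (2 * h)
    g_p = (g(x, p + h) - g(x, p - h)) / (2 * h)
    return f_x * g_p - f_p * g_x

m, k = 2.0, 3.0                      # illustrative mass and spring constant

def H(x, p):
    """Hamiltonian of a one-dimensional harmonic oscillator (Eq. 3.1 with V = k x^2 / 2)."""
    return p * p / (2 * m) + 0.5 * k * x * x

def X(x, p):
    return x

def P(x, p):
    return p

x, p = 0.4, -1.1                     # an arbitrary point in the phase space

canonical = poisson_bracket(X, P, x, p)   # Eq. 3.5: should be 1
dp_dt = poisson_bracket(P, H, x, p)       # Eq. 3.6: should equal -dH/dx = -k x
dx_dt = poisson_bracket(X, H, x, p)       # Eq. 3.7: should equal  dH/dp = p / m

print(canonical, dp_dt, dx_dt)
```

The last two numbers reproduce the right-hand sides of Hamiltonian equations 3.2 and 3.3 for this particular Hamiltonian, which is all that Eqs. 3.6 and 3.7 claim.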


3.2 Operators in Quantum Mechanics 
3.2.1 General Definitions 


The main task of quantum theory is to be able to predict (or explain) results of 
experiments conducted with quantum systems. All such experiments involve taking 
a system in some initial state, subjecting it to external influences, which change 
its environment, and observing a reaction of the system to these changes. The
theoretician in me would say that by doing all these manipulations and measurements,
experimentalists change the quantum state of the system, but so far the formalism I
have at my disposal does not have any theoretical representation of all these turning
knobs and dials, lasers that go on and off, magnets, thermostats, and all other real
material objects in the arsenal of an experimentalist. I need additional mathematical
tools, which would allow me to describe theoretically all these changes inflicted
upon an unsuspecting system by the men in lab coats. Since the quantum states are
presented in the theory by vectors of a linear vector space, what I need are objects 
that can change these vectors. Such objects are known to mathematicians—they 
call them operators. The role of operators in quantum theory is twofold. On one 
hand, they are used to describe transformations of state vectors, and on the other 
hand, they provide the theoretical means to predict the outcomes of measurements 
of observables. 
From the mathematical standpoint, an operator is a rule prescribing how to
change one abstract vector of a linear vector space, say, $|\alpha\rangle$, into another abstract
vector, say, $|\beta\rangle$, of the same or a different vector space. Symbolically this can be
represented as

$$|\beta\rangle = \hat{T}|\alpha\rangle, \tag{3.9}$$

where the "hat" above a capital letter (in this case $\hat{T}$) signifies that $\hat{T}$ represents
such a rule, or an operator, "acting" on $|\alpha\rangle$ and converting it into $|\beta\rangle$. Note that the
symbol of the operator appears in Eq. 3.9 next to the vertical line marking the "tail"
of the ket $|\alpha\rangle$.

A special role in quantum mechanics and other applications is played by linear
operators, the class of rules satisfying the following condition:

$$\hat{T}\left(a_1|\alpha_1\rangle + a_2|\alpha_2\rangle\right) = a_1\hat{T}|\alpha_1\rangle + a_2\hat{T}|\alpha_2\rangle. \tag{3.10}$$

Here are a few examples of linear operators: 

1. Differentiation operator $d/dx$ converting a function $f(x)$ into its derivative $g(x) =
(d/dx)f = df/dx$ (note how the operator symbol appears on the left of the
function)

2. Gradient operator $\nabla = \mathbf{e}_x\,\partial/\partial x + \mathbf{e}_y\,\partial/\partial y + \mathbf{e}_z\,\partial/\partial z$, where $\mathbf{e}_{x,y,z}$ are unit vectors
in the directions of the respective coordinate axes, converting a scalar function of
three spatial variables into a vector:

$$\nabla f(x, y, z) = \mathbf{e}_x\frac{\partial f}{\partial x} + \mathbf{e}_y\frac{\partial f}{\partial y} + \mathbf{e}_z\frac{\partial f}{\partial z}$$

3. Integration operator $\hat{K}$, which is defined by its kernel $K(x_1, x_2)$ and converts one
function to another as

$$|g\rangle = \hat{K}|f\rangle \iff g(x_1) = \int_{-\infty}^{\infty} K(x_1, x_2)\,f(x_2)\,dx_2$$

4. Rotation operator $\hat{R}$, which changes the orientation of a vector without changing
its length

Linearity of the first three operators is evident from linearity of differentiation and
integration, and the proof of linearity of rotations is a simple exercise in geometry
and is left to the readers to perform.

Equation 3.9 defines an operator by its action on a ket vector. It is also possible
to define an operator acting on bra vectors. One can, for instance, perform formal
Hermitian conjugation of Eq. 3.9 and introduce the Hermitian conjugate operator $\hat{T}^\dagger$:

$$\langle\beta| = \langle\alpha|\hat{T}^\dagger. \tag{3.11}$$

Notice that now operator $\hat{T}^\dagger$ stands to the right of the respective bra vector but still
next to its tail, "acting" to the left. Thus, Hermitian conjugation in this case also
involves a change in the order in which the participating objects are written, as well
as in the "direction" of "action" of the operators, which is now from right to left.

In order to help you develop intuition regarding the transition between Eqs. 3.9
and 3.11, consider a linear space of column vectors, i.e., $N \times 1$ matrices. For operators
you can take $N \times N$ matrices and define their action on a vector as regular matrix
multiplication. For this definition to make sense from the point of view of matrix
multiplication rules, the matrix must be placed to the left of the column vector. The
result of this operation is another column vector:

$$\begin{bmatrix} t_{11} & t_{12} & \cdots & t_{1N} \\ t_{21} & t_{22} & \cdots & t_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ t_{N1} & t_{N2} & \cdots & t_{NN} \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_N \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_N \end{bmatrix}. \tag{3.12}$$


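The Hermitian conjugate of a column vector is a row vector ($1 \times N$ matrix) with
complex-conjugated elements. Equation 3.12 contains two column vectors, and
its Hermitian conjugated version must describe the relation between two rows
$[a_1^*\ a_2^*\ \cdots\ a_N^*]$ and $[b_1^*\ b_2^*\ \cdots\ b_N^*]$. However, in order to be able to multiply a square
matrix and a row vector, I must place the former to the right of the latter:

$$\begin{bmatrix} a_1^* & a_2^* & \cdots & a_N^* \end{bmatrix}\begin{bmatrix} t^\dagger_{11} & t^\dagger_{12} & \cdots & t^\dagger_{1N} \\ t^\dagger_{21} & t^\dagger_{22} & \cdots & t^\dagger_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ t^\dagger_{N1} & t^\dagger_{N2} & \cdots & t^\dagger_{NN} \end{bmatrix} = \begin{bmatrix} b_1^* & b_2^* & \cdots & b_N^* \end{bmatrix}, \tag{3.13}$$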


where $t^\dagger_{ij}$ represents elements of the Hermitian conjugate operator matrix $\hat{T}^\dagger$. If I want
(and I certainly do) the relation between elements $a_i^*$ and $b_i^*$ expressed by
Eq. 3.13 to reproduce the complex-conjugated relations given by Eq. 3.12, I must require
that the rows of the matrix in Eq. 3.12 coincide with the complex-conjugated
columns of the matrix in Eq. 3.13: $t^\dagger_{ij} = t^*_{ji}$. This gives me an operational (not
just formal) rule for performing the Hermitian conjugation of the matrix operator:
it consists of regular matrix transposition and complex conjugation of all matrix
elements. This example serves two important purposes: first, it demonstrates
why reversal of the order in which vectors and operators appear after Hermitian
conjugation makes sense, and, second, it yields a rule for Hermitian conjugation of
a matrix.
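The rule "transpose and complex conjugate" and the relation between Eqs. 3.12 and 3.13 can be tried out on a small matrix. The sketch below is just an illustration: the $2 \times 2$ matrix `T`, the vector `a`, and the helper names `dagger`, `mat_vec`, and `vec_mat` are arbitrary choices of mine.

```python
def dagger(T):
    """Hermitian conjugate of a square matrix: transposition plus complex conjugation."""
    n = len(T)
    return [[T[j][i].conjugate() for j in range(n)] for i in range(n)]

def mat_vec(T, a):
    """Matrix acting on a column vector (matrix on the left), as in Eq. 3.12."""
    return [sum(T[i][j] * a[j] for j in range(len(a))) for i in range(len(T))]

def vec_mat(row, T):
    """Row vector multiplied by a matrix (matrix on the right), as in Eq. 3.13."""
    return [sum(row[i] * T[i][j] for i in range(len(row))) for j in range(len(T[0]))]

T = [[1 + 2j, 3j], [4 + 0j, 5 - 1j]]
a = [2 - 1j, 1 + 1j]

b = mat_vec(T, a)                               # column b = T a
b_row = [z.conjugate() for z in b]              # Hermitian conjugate of b
a_row = [z.conjugate() for z in a]              # Hermitian conjugate of a
b_row_from_dagger = vec_mat(a_row, dagger(T))   # row a^dagger times T^dagger

print(b_row)
print(b_row_from_dagger)
```

Both printed rows should coincide, which is the matrix version of going from Eq. 3.9 to Eq. 3.11.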

In a general case, Eq. 3.11 does not give us any clue on how to actually generate
Hermitian conjugate operators. In order to derive such a rule, I need to relate both
(initial and Hermitian conjugate) operators to a quantity, which I know how to
transform and which does not depend on any concrete realization of the vector space
or an operator. The only quantity of this kind, which I know of, is an inner product,
and, in order to get to it, I will multiply Eq. 3.9 by a bra vector $\langle\beta|$ from the left.
This will leave me with expression $\langle\beta|\hat{T}|\alpha\rangle$, which can be understood as a product
of a bra vector $\langle\beta|$ and a ket vector $\hat{T}|\alpha\rangle$. Complex conjugating this expression and
applying Eq. 2.19, I get

$$\langle\beta|\hat{T}|\alpha\rangle^* = \langle\alpha|\hat{T}^\dagger|\beta\rangle, \tag{3.14}$$

where I also used Eq. 3.11 to convert ket $\hat{T}|\alpha\rangle$ into a corresponding bra $\langle\alpha|\hat{T}^\dagger$.

Equation 3.14 can be used to find a Hermitian conjugate of any particular operator 
as illustrated by the following examples. 

Example 7 (Hermitian Conjugation) Consider differentiation operator $\hat{D}$ acting on
differentiable square-integrable functions as

$$\hat{D}|f\rangle = \frac{df}{dx}.$$

Using the definition of the inner product defined by Eq. 2.21, you can present the
expression in Eq. 3.14 as

$$\langle g|\hat{D}|f\rangle = \int_{-\infty}^{\infty} dx\,g^*(x)\frac{df}{dx}.$$

Integration by parts converts this expression into the following form:

$$\left(\langle g|\hat{D}|f\rangle\right)^* = \left(\int_{-\infty}^{\infty} dx\,g^*(x)\frac{df}{dx}\right)^* = g(x)f^*(x)\Big|_{-\infty}^{\infty} - \int_{-\infty}^{\infty} dx\,f^*(x)\frac{dg}{dx} = -\int_{-\infty}^{\infty} dx\,f^*(x)\frac{dg}{dx},$$

where I took into account that any square-integrable functions must vanish at both
positive and negative infinities. Presenting this result in the form of the right side of
Eq. 3.14, $\langle f|\hat{D}^\dagger|g\rangle$, you can identify $\hat{D}^\dagger = -d/dx$.

If an operator and its Hermitian conjugate coincide,

$$\langle\beta|\hat{T}|\alpha\rangle^* = \langle\alpha|\hat{T}|\beta\rangle \tag{3.15}$$

or $\hat{T} = \hat{T}^\dagger$, the respective operator is called a Hermitian or self-adjoint operator.
Hermitian operators have a number of important properties, which will be discussed
in more detail in Sect. 3.3. Here I shall note just one important property of Hermitian
operators, which trivially follows from Eq. 3.15: a quantity defined as $\langle\alpha|\hat{T}|\alpha\rangle$ is a
real-valued number for any choice of state $|\alpha\rangle$. Expressions of this type are called
expectation values of the operator in a given state. The origin of this name will
become clear in Sect. 3.3. A few examples of Hermitian operators follow below.

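Example 8 (Hermitian Operators) Let me prove that operator $i\hat{D}$, where $\hat{D}$ is the
differentiation operator introduced in the previous example, is Hermitian. To this
end I just need to repeat computations from Example 7:

$$\left(\langle g|i\hat{D}|f\rangle\right)^* = \left(i\int_{-\infty}^{\infty} dx\,g^*(x)\frac{df}{dx}\right)^* = -i\,g(x)f^*(x)\Big|_{-\infty}^{\infty} + i\int_{-\infty}^{\infty} dx\,f^*(x)\frac{dg}{dx} = \langle f|i\hat{D}|g\rangle.$$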

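Example 9 (Hermitian Operators) As a second example of a Hermitian operator,
I consider a $3 \times 3$ matrix $M$ acting on vectors in a three-dimensional vector space:

$$M = \begin{bmatrix} 1 & i & 2 \\ -i & 1 & 4i \\ 2 & -4i & 0 \end{bmatrix}.$$

I will demonstrate that this matrix is Hermitian by directly applying the rule that
Hermitian conjugation of matrices consists of transposition and complex conjugation.
Carrying out these operations consecutively, you can convince yourselves that they
yield the same matrix $M$:

$$M^\dagger = \left(\begin{bmatrix} 1 & i & 2 \\ -i & 1 & 4i \\ 2 & -4i & 0 \end{bmatrix}^{T}\right)^* = \begin{bmatrix} 1 & -i & 2 \\ i & 1 & -4i \\ 2 & 4i & 0 \end{bmatrix}^* = \begin{bmatrix} 1 & i & 2 \\ -i & 1 & 4i \\ 2 & -4i & 0 \end{bmatrix}.$$

You can also compute expression $a^\dagger M a$, where $a$ is an arbitrary column vector
and $a^\dagger$ its Hermitian conjugate:

$$\begin{bmatrix} a_1^* & a_2^* & a_3^* \end{bmatrix}\begin{bmatrix} 1 & i & 2 \\ -i & 1 & 4i \\ 2 & -4i & 0 \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} a_1^* & a_2^* & a_3^* \end{bmatrix}\begin{bmatrix} a_1 + ia_2 + 2a_3 \\ -ia_1 + a_2 + 4ia_3 \\ 2a_1 - 4ia_2 \end{bmatrix} =$$

$$a_1^*a_1 + ia_1^*a_2 + 2a_1^*a_3 - ia_2^*a_1 + a_2^*a_2 + 4ia_2^*a_3 + 2a_3^*a_1 - 4ia_3^*a_2 =$$

$$|a_1|^2 + |a_2|^2 + 2\left(a_1^*a_3 + a_3^*a_1\right) + i\left(a_1^*a_2 - a_2^*a_1 + 4a_2^*a_3 - 4a_3^*a_2\right).$$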


It is obvious that the final expression is real-valued as promised. 
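If the reality of the last expression does not look obvious at first glance, a quick numerical test may help. The sketch below takes the matrix $M$ of Example 9 and an arbitrarily chosen complex vector `a` (my own choice, as are the helper names) and evaluates $a^\dagger M a$ both directly and from the final closed-form expression.

```python
M = [[1, 1j, 2], [-1j, 1, 4j], [2, -4j, 0]]    # matrix of Example 9

def dagger(T):
    """Hermitian conjugate: transposition plus complex conjugation."""
    n = len(T)
    return [[T[j][i].conjugate() for j in range(n)] for i in range(n)]

def expectation(T, a):
    """The number a^dagger T a for a column vector a."""
    n = len(a)
    return sum(a[i].conjugate() * T[i][j] * a[j] for i in range(n) for j in range(n))

a = [1 + 2j, -3j, 0.5 - 1j]                    # arbitrary complex vector
a1, a2, a3 = a
c1, c2, c3 = (z.conjugate() for z in a)

is_hermitian = dagger(M) == M
value = expectation(M, a)
closed_form = (abs(a1) ** 2 + abs(a2) ** 2 + 2 * (c1 * a3 + c3 * a1)
               + 1j * (c1 * a2 - c2 * a1 + 4 * c2 * a3 - 4 * c3 * a2))

print(is_hermitian, value, closed_form)
```

The imaginary part of `value` should vanish (up to rounding), and the two ways of computing the number should agree.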
3.2.2 Commutators, Functions of Operators, and Operator Identities

In addition to Hermitian conjugation, you will need to perform other, less exotic,
operations on operators, such as multiplication. The product of two operators $\hat{T}_1$ and
$\hat{T}_2$ is defined as consecutive action of the operators. If you consider action on a ket
vector, the first operator to do the work is the one on the right:

$$\left(\hat{T}_2\hat{T}_1\right)|\alpha\rangle = \hat{T}_2\left(\hat{T}_1|\alpha\rangle\right).$$

In the case of operators acting on the bra vector, the order is opposite: the first to act
is the leftmost operator:

$$\langle\alpha|\left(\hat{T}_2\hat{T}_1\right) = \left(\langle\alpha|\hat{T}_2\right)\hat{T}_1.$$

The most important property of the operator multiplication is actually the absence
of a property: multiplication of operators is not, in general, commutative²:

$$\hat{T}_2\hat{T}_1 \neq \hat{T}_1\hat{T}_2.$$

The non-commutative nature of operator multiplication is of extreme importance
in quantum mechanics, and as you will see, it is the main mathematical feature
responsible, for instance, for the uncertainty relation. For the same reason, sets
of operators that do commute with each other also play an important role in the
quantum formalism.

The non-commutativity of operator multiplication is expressed quantitatively via
the notion of a commutator. The commutator of two operators $\hat{T}_1$, $\hat{T}_2$ is defined as

$$\left[\hat{T}_1, \hat{T}_2\right] = \hat{T}_1\hat{T}_2 - \hat{T}_2\hat{T}_1. \tag{3.16}$$

The knowledge of the commutator or, as it is sometimes called, a commutation
relation between two operators is essential and, often, the most important information
about operators that you can have. You will see throughout the course how the
commutation relations of different operators are used in a variety of applications
and calculations.

Commutators have a few important properties, the most frequently used of which
are the following:

$$\left[\hat{T}_1, \hat{T}_2\right] = -\left[\hat{T}_2, \hat{T}_1\right] \tag{3.17}$$

$$\left[\hat{T}_1 + \hat{T}_2, \hat{T}_3\right] = \left[\hat{T}_1, \hat{T}_3\right] + \left[\hat{T}_2, \hat{T}_3\right] \tag{3.18}$$

$$\left[c_1\hat{T}_1, c_2\hat{T}_2\right] = c_1c_2\left[\hat{T}_1, \hat{T}_2\right]. \tag{3.19}$$

² We are all used to dealing with commutative multiplication of numbers: the result does not depend
on the order in which multiplication is performed. The lack of commutativity of multiplication
was one of the features of the Heisenberg theory that especially freaked out Schrödinger.

The proof of all these identities is quite obvious, and I shall leave it for you as an 
exercise. 
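While you are working on that exercise, a numerical spot check of Eqs. 3.17 through 3.19 may be reassuring. The sketch uses three arbitrarily chosen $2 \times 2$ matrices in place of the operators; the matrices, the constants `c1` and `c2`, and the helper names are my own illustrative assumptions, and a check on one example is of course not a proof.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(A, B):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def commutator(A, B):
    """[A, B] = AB - BA, Eq. 3.16."""
    return add(matmul(A, B), scale(-1, matmul(B, A)))

T1 = [[1, 2], [3, 4]]
T2 = [[0, 1], [1, 0]]
T3 = [[2, 0], [1, -1]]
c1, c2 = 2, 3j

antisymmetry = commutator(T1, T2) == scale(-1, commutator(T2, T1))            # Eq. 3.17
additivity = (commutator(add(T1, T2), T3)
              == add(commutator(T1, T3), commutator(T2, T3)))                 # Eq. 3.18
homogeneity = (commutator(scale(c1, T1), scale(c2, T2))
               == scale(c1 * c2, commutator(T1, T2)))                         # Eq. 3.19

print(antisymmetry, additivity, homogeneity)
print(commutator(T1, T2))            # nonzero: T1 and T2 do not commute
```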

Having defined a product of two operators, I can introduce a power function for
the operators: $\hat{T}^n$ simply means applying the same operator $n$ times. The power
function is important because it allows defining other, more complex, functions
of the operators. In general, a function $f(x)$ that has infinitely many derivatives
at $x = 0$ can be expanded in an infinite Taylor series. Using this series one can
define the operator function $f(\hat{T})$ by simply substituting the operator instead of
$x$ in the series:

$$f\left(\hat{T}\right) = \sum_{n=0}^{\infty}\frac{1}{n!}\left.\frac{d^nf}{dx^n}\right|_{x=0}\hat{T}^n.$$

However, a number of important functions, which you are used to dealing with
routinely, cannot be defined this way and, therefore, do not make sense for operators.
Among them are $\sqrt{\hat{T}}$, $\ln\left(\hat{T}\right)$, and other similar functions with singularities at zero.
An important exception is function $\hat{T}^{-1}$, called the inverse operator, which is defined by
the equation

$$\hat{T}\hat{T}^{-1} = \hat{T}^{-1}\hat{T} = \hat{I}, \tag{3.20}$$


where $\hat{I}$ is a unity operator, i.e., an operator which does not change a vector it
acts upon. The meaning of the inverse operator can be illustrated by the following
expressions:

$$\hat{T}|\alpha\rangle = |\beta\rangle$$

$$\hat{T}^{-1}|\beta\rangle = |\alpha\rangle,$$

where the second line is obtained from the first one by multiplying both sides of the
latter by $\hat{T}^{-1}$. Finding inverse operators is usually a difficult task and often amounts
to solving an entire problem. If an operator has the form of a matrix, its inverse can
be found according to standard rules for inverting matrices.

Finding inverse operators is significantly simplified for a special class of
operators called unitary operators. These operators, defined by the condition

$$\hat{U}^\dagger = \hat{U}^{-1},$$

play an extremely important role in quantum theory (we value them, of course, not
just because their inverse is easy to find). The main property of unitary operators
is that they do not change the norm of the vectors or their inner products. Indeed,
consider vectors $|\alpha\rangle$ and $|\beta\rangle$, and define new vectors $|\tilde{\alpha}\rangle = \hat{U}|\alpha\rangle$ and $|\tilde{\beta}\rangle = \hat{U}|\beta\rangle$,
where $\hat{U}$ is a unitary operator. Direct computation of $\langle\tilde{\alpha}|\tilde{\beta}\rangle$ proves this statement:

$$\langle\tilde{\alpha}| = \langle\alpha|\hat{U}^\dagger \Rightarrow \langle\tilde{\alpha}|\tilde{\beta}\rangle = \langle\alpha|\hat{U}^\dagger\hat{U}|\beta\rangle = \langle\alpha|\hat{U}^{-1}\hat{U}|\beta\rangle = \langle\alpha|\beta\rangle.$$

Unitary operators are a generalization of the rotation operator acting on regular
three-dimensional vectors: rotation of two vectors by the same angle changes neither
their lengths nor the angle between them. As a result, the dot product
of these vectors also does not change. Here is an example of a unitary operator based
on the two-dimensional rotation matrix.

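Example 10 (Unitary Operators) Consider the well-known matrix used to relate
the coordinates of a two-dimensional vector rotated by an angle $\theta$ from its initial
position:

$$R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}.$$

Its Hermitian conjugate is

$$R^\dagger = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}.$$

Simple computation shows that product $R^\dagger R$ is a unity matrix:

$$R^\dagger R = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} \cos^2\theta + \sin^2\theta & -\cos\theta\sin\theta + \cos\theta\sin\theta \\ -\cos\theta\sin\theta + \cos\theta\sin\theta & \cos^2\theta + \sin^2\theta \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$

This proves, of course, that $R^\dagger = R^{-1}$.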
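The statement that unitary operators preserve inner products can be seen at work in the same example. In the sketch below the angle `theta` and the two vectors `u` and `v` are arbitrary values I picked for illustration, and the helper names are hypothetical.

```python
import math

theta = 0.83                                   # arbitrary rotation angle in radians
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]

def dagger(T):
    """Hermitian conjugate: transposition plus complex conjugation."""
    n = len(T)
    return [[T[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_vec(T, a):
    return [sum(T[i][j] * a[j] for j in range(len(a))) for i in range(len(T))]

def inner(a, b):
    """Inner product <a|b> of two column vectors."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

product = matmul(dagger(R), R)                 # should be the unity matrix

u, v = [1.0, 2.0], [-0.5, 3.0]
before = inner(u, v)
after = inner(mat_vec(R, u), mat_vec(R, v))    # inner product of the rotated vectors

print(product)
print(before, after)
```

Up to rounding errors, `product` is the unity matrix, and the inner product is the same before and after the rotation.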

An important example of an operator function is an exponential function defined
as

$$\exp\left(\hat{T}\right) = \sum_{n=0}^{\infty}\frac{\hat{T}^n}{n!}. \tag{3.21}$$

Some of the familiar properties of this function remain valid even when its argument
is an operator. For instance, the derivative of the expression $f(\lambda) = \exp\left(\lambda\hat{T}\right)$ with
respect to the parameter $\lambda$ is calculated as though $\hat{T}$ were a regular number:

$$\frac{df}{d\lambda} = \hat{T}\exp\left(\lambda\hat{T}\right).$$

You should be warned, however, that a very convenient property of exponential
functions

$$\exp(x + y) = \exp(x)\exp(y) \tag{3.22}$$

does not, in general, hold for operator arguments. One way to understand the reason for
this unfortunate circumstance is to notice that if two operators $\hat{T}_1$ and $\hat{T}_2$ in the
argument of the exponential function $\exp\left(\hat{T}_1 + \hat{T}_2\right)$ do not commute, expressions
$\exp\left(\hat{T}_1\right)\exp\left(\hat{T}_2\right)$ and $\exp\left(\hat{T}_2\right)\exp\left(\hat{T}_1\right)$ are not equivalent, so they cannot both be
equal to the exponential of the sum of these operators. Generalization of Eq. 3.22
to the case of operator arguments is, in general, very complicated and will not
be considered here. There is, however, one case when such a generalization has
a relatively simple form and can be derived without too much effort, although
some work is still required, of course. This simplification takes place when the
commutator of the operators $\hat{T}_1$ and $\hat{T}_2$ commutes with both of them. In most cases,
this means that the commutator is a regular number, but it does not have to be.


So, suppose that the commutator of two operators $\hat{T}_1$ and $\hat{T}_2$ is $\left[\hat{T}_1, \hat{T}_2\right] = \hat{C}$,
where $\hat{C}$ is such that $\left[\hat{T}_1, \hat{C}\right] = \left[\hat{T}_2, \hat{C}\right] = 0$. This assumption appears to be quite
restrictive, but in reality, it is fulfilled for a great many pairs of operators that are
important for quantum mechanics. In order to derive the promised generalization
of Eq. 3.22, I have to, first, prove two intermediate identities, which, however, are
useful in their own right. Let me begin by computing the following expression:

$$\left[\hat{T}_1, e^{\lambda\hat{T}_2}\right] = \left[\hat{T}_1, \sum_{n=0}^{\infty}\frac{\lambda^n\hat{T}_2^n}{n!}\right] = \sum_{n=0}^{\infty}\frac{\lambda^n}{n!}\left[\hat{T}_1, \hat{T}_2^n\right]. \tag{3.23}$$

To proceed I need to prove the following identity for the commutators:

$$\left[\hat{T}_1, \hat{T}_2^n\right] = n\hat{C}\hat{T}_2^{n-1}. \tag{3.24}$$

The easiest way to do it is to use the method of mathematical induction. For those
who have forgotten how this method works, the first step is to prove the statement for
the first nontrivial value of the index ($n = 2$ in this case). After that you assume that
the statement is correct for $n = k$ and, using this assumption, prove it for $n = k + 1$.
Thus, the first step: consider $n = 2$:
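$$\left[\hat{T}_1, \hat{T}_2^2\right] = \hat{T}_1\hat{T}_2^2 - \hat{T}_2^2\hat{T}_1 = \hat{T}_1\hat{T}_2\hat{T}_2 - \hat{T}_2\hat{T}_1\hat{T}_2 + \hat{T}_2\hat{T}_1\hat{T}_2 - \hat{T}_2\hat{T}_2\hat{T}_1 = \left(\hat{T}_1\hat{T}_2 - \hat{T}_2\hat{T}_1\right)\hat{T}_2 + \hat{T}_2\left(\hat{T}_1\hat{T}_2 - \hat{T}_2\hat{T}_1\right) = 2\hat{C}\hat{T}_2.$$

(Note that it works because $\hat{C}$ commutes with $\hat{T}_2$.) Next, the $n = k$ assumption:

$$\left[\hat{T}_1, \hat{T}_2^k\right] = k\hat{C}\hat{T}_2^{k-1}.$$

The final step is the proof for $n = k + 1$:

$$\left[\hat{T}_1, \hat{T}_2^{k+1}\right] = \hat{T}_1\hat{T}_2^{k+1} - \hat{T}_2^{k+1}\hat{T}_1 = \hat{T}_1\hat{T}_2^{k+1} - \hat{T}_2\hat{T}_1\hat{T}_2^k + \hat{T}_2\hat{T}_1\hat{T}_2^k - \hat{T}_2^{k+1}\hat{T}_1 = \left[\hat{T}_1, \hat{T}_2\right]\hat{T}_2^k + \hat{T}_2\left[\hat{T}_1, \hat{T}_2^k\right] = \hat{C}\hat{T}_2^k + k\hat{C}\hat{T}_2^k = (k + 1)\hat{C}\hat{T}_2^k.$$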

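Using this identity I can transform Eq. 3.23 into

$$\left[\hat{T}_1, e^{\lambda\hat{T}_2}\right] = \sum_{n=0}^{\infty}\frac{\lambda^n}{n!}\left[\hat{T}_1, \hat{T}_2^n\right] = \hat{C}\sum_{n=1}^{\infty}\frac{n\lambda^n}{n!}\hat{T}_2^{n-1} = \lambda\hat{C}\sum_{n=1}^{\infty}\frac{\lambda^{n-1}}{(n-1)!}\hat{T}_2^{n-1} = \lambda\hat{C}e^{\lambda\hat{T}_2}. \tag{3.25}$$

This result can be used to derive another important identity. Multiply Eq. 3.25 by
$e^{-\lambda\hat{T}_2}$ from the left:

$$e^{-\lambda\hat{T}_2}\left[\hat{T}_1, e^{\lambda\hat{T}_2}\right] = e^{-\lambda\hat{T}_2}\lambda\hat{C}e^{\lambda\hat{T}_2}.$$

The right-hand side of this expression simplifies to $\hat{C}\lambda$ because $\hat{C}$ commutes with $\hat{T}_2$
and $e^{-\lambda\hat{T}_2}e^{\lambda\hat{T}_2} = e^{-\lambda\hat{T}_2 + \lambda\hat{T}_2} = \hat{I}$, since Eq. 3.22 is applicable for any commuting
operators and any operator commutes with itself. Now you can expand the commutator
on the left of the expression above to get

$$e^{-\lambda\hat{T}_2}\hat{T}_1e^{\lambda\hat{T}_2} - \hat{T}_1 = \hat{C}\lambda$$

or

$$e^{-\lambda\hat{T}_2}\hat{T}_1e^{\lambda\hat{T}_2} = \hat{T}_1 + \hat{C}\lambda. \tag{3.26}$$

Now I am ready to approach my main target and to prove that

$$e^{\hat{T}_1 + \hat{T}_2} = e^{\hat{T}_1}e^{\hat{T}_2}e^{-\frac{1}{2}\left[\hat{T}_1, \hat{T}_2\right]}. \tag{3.27}$$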
The proof of this identity is more involved than the two previous derivations. Direct
proof (for instance, by using series expansions of the exponential functions on
both sides of Eq. 3.27) results in expressions too cumbersome to allow for fruitful
analysis. Therefore, I am going to use an indirect approach, which was invented by
Harvard Professor Roy Glauber, winner of the 2005 Nobel Prize for his contribution
to quantum optics. Glauber considered function $f(x) = e^{x\hat{T}_1}e^{x\hat{T}_2}$, for which he
derived a differential equation by computing its derivative:

$$\frac{df}{dx} = \hat{T}_1e^{x\hat{T}_1}e^{x\hat{T}_2} + e^{x\hat{T}_1}\hat{T}_2e^{x\hat{T}_2}.$$

Note how operators $\hat{T}_1$ and $\hat{T}_2$ are placed in this expression: $\hat{T}_1$ appears in front of
the exponent containing $\hat{T}_2$ because it originates from the exponential function of
$\hat{T}_1$ positioned to the left of $e^{x\hat{T}_2}$. At the same time, $\hat{T}_2$ appears behind $e^{x\hat{T}_1}$ following
the respective position of $e^{x\hat{T}_2}$. Relative positions of $e^{x\hat{T}_i}$ and the respective $\hat{T}_i$ are
not important because these operators commute (any operator commutes with any
function of the same operator). Now the derivative can be rewritten in the following
way:

$$\frac{df}{dx} = e^{x\hat{T}_1}e^{x\hat{T}_2}e^{-x\hat{T}_2}e^{-x\hat{T}_1}\left(\hat{T}_1e^{x\hat{T}_1}e^{x\hat{T}_2} + e^{x\hat{T}_1}\hat{T}_2e^{x\hat{T}_2}\right).$$

It is not too difficult to see that the expression in front of the brackets is equal to
unity, so writing it there does not change anything. Continue:

$$\frac{df}{dx} = f(x)\left(e^{-x\hat{T}_2}\hat{T}_1e^{x\hat{T}_2} + \hat{T}_2\right) = f(x)\left(\hat{T}_1 + x\left[\hat{T}_1, \hat{T}_2\right] + \hat{T}_2\right),$$

where the identity given by Eq. 3.26 is used and $\hat{C}$ is replaced with $\left[\hat{T}_1, \hat{T}_2\right]$. This
differential equation can now be solved for function $f(x)$:

$$\frac{df}{f} = dx\left(\hat{T}_1 + \hat{T}_2 + x\left[\hat{T}_1, \hat{T}_2\right]\right) \Rightarrow f = f_0\exp\left[x\left(\hat{T}_1 + \hat{T}_2\right) + \frac{x^2}{2}\left[\hat{T}_1, \hat{T}_2\right]\right],$$

where integration constant $f_0$ is chosen to satisfy the obvious initial condition:
$f(0) = 1$. With this in mind, function $f$ can be written as

$$f = e^{x\left(\hat{T}_1 + \hat{T}_2\right) + \frac{x^2}{2}\left[\hat{T}_1, \hat{T}_2\right]}.$$

Setting $x = 1$ in this expression and multiplying it by $e^{-\frac{1}{2}\left[\hat{T}_1, \hat{T}_2\right]}$, Eq. 3.27 is finally
obtained, completing the proof.
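Identities 3.26 and 3.27 can be verified exactly for matrices whose exponential series terminate. In the sketch below I assume, purely for illustration, two strictly upper triangular $3 \times 3$ matrices: their third powers vanish, so every exponential is a finite sum, and exact rational arithmetic can be used. The matrices, the value of `lam`, and the helper names are my own choices.

```python
from fractions import Fraction

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(A, B):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def exp_nilpotent(A):
    """exp(A) as a finite sum; valid because A^n = 0 for a strictly triangular n x n matrix."""
    n = len(A)
    result, term = identity(n), identity(n)
    for order in range(1, n):
        term = scale(Fraction(1, order), matmul(term, A))    # A^order / order!
        result = add(result, term)
    return result

T1 = [[0, 2, 0], [0, 0, 0], [0, 0, 0]]
T2 = [[0, 0, 0], [0, 0, 3], [0, 0, 0]]
C = add(matmul(T1, T2), scale(-1, matmul(T2, T1)))           # [T1, T2], commutes with both

lam = Fraction(5, 7)
lhs_326 = matmul(matmul(exp_nilpotent(scale(-lam, T2)), T1), exp_nilpotent(scale(lam, T2)))
rhs_326 = add(T1, scale(lam, C))                             # Eq. 3.26

lhs_327 = exp_nilpotent(add(T1, T2))                         # Eq. 3.27, left-hand side
rhs_327 = matmul(matmul(exp_nilpotent(T1), exp_nilpotent(T2)),
                 exp_nilpotent(scale(Fraction(-1, 2), C)))

print(lhs_326 == rhs_326, lhs_327 == rhs_327)
```

Dropping the last factor in `rhs_327` spoils the equality, which is a direct illustration of why Eq. 3.22 cannot be used for non-commuting operators.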
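I want to finish this section with two important technical statements about
Hermitian operators. The first one is concerned with Hermitian conjugation of a
product of two Hermitian operators. It can be shown that

$$\left(\hat{T}_1\hat{T}_2\right)^\dagger = \hat{T}_2\hat{T}_1. \tag{3.28}$$

This statement can be proven as follows. By definition

$$\langle\alpha|\left(\hat{T}_1\hat{T}_2\right)^\dagger|\beta\rangle = \left(\langle\beta|\hat{T}_1\hat{T}_2|\alpha\rangle\right)^*.$$

Introducing $\langle\beta|\hat{T}_1 = \langle\tilde{\beta}|$, $\hat{T}_2|\alpha\rangle = |\tilde{\alpha}\rangle$ and using Eq. 2.19, you can write the right-hand
side of this expression as

$$\left(\langle\tilde{\beta}|\tilde{\alpha}\rangle\right)^* = \langle\tilde{\alpha}|\tilde{\beta}\rangle.$$

Rules for Hermitian conjugation yield $|\tilde{\beta}\rangle = \hat{T}_1^\dagger|\beta\rangle$ and $\langle\tilde{\alpha}| = \langle\alpha|\hat{T}_2^\dagger$, which allows
one to proceed as follows:

$$\left(\langle\beta|\hat{T}_1\hat{T}_2|\alpha\rangle\right)^* = \left(\langle\tilde{\beta}|\tilde{\alpha}\rangle\right)^* = \langle\tilde{\alpha}|\tilde{\beta}\rangle = \langle\alpha|\hat{T}_2^\dagger\hat{T}_1^\dagger|\beta\rangle.$$

By the way, you may have noticed that I had actually proved a more general
statement. Indeed, the last equation means that

$$\left(\hat{T}_1\hat{T}_2\right)^\dagger = \hat{T}_2^\dagger\hat{T}_1^\dagger,$$

which is valid for any linear, not necessarily Hermitian, operators. Equation 3.28
follows from this result if $\hat{T}_1$ and $\hat{T}_2$ are Hermitian. An immediate corollary of this
result is the following:

$$\left[\hat{T}_1, \hat{T}_2\right]^\dagger = -\left[\hat{T}_1, \hat{T}_2\right]. \tag{3.29}$$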

Operators which change sign upon Hermitian conjugation are called anti-Hermitian, 
so the commutator of two Hermitian operators is also anti-Hermitian. It is now easy 
to demonstrate that a commutator of two Hermitian operators can be presented as 


3.2 Operators in Quantum Mechanics 


55 


where A is Hermitian. If the commutator is a number, Eq. 3.30 is reduced to 



(3.31) 


where c is real. 
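
These statements are easy to test on small matrices. The sketch below is an illustration only: it uses the Pauli matrices sigma_x and sigma_y as two Hermitian test operators (a choice made here, not in the text), and the helpers `matmul`, `dagger`, and `commutator` are hypothetical names. It checks that the Hermitian conjugate of a product reverses the order of the factors, that the commutator changes sign under conjugation, and that dividing the commutator by i leaves a Hermitian matrix.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Hermitian conjugate: transpose combined with complex conjugation."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def commutator(a, b):
    """Return the matrix ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

# Two Hermitian test matrices (Pauli matrices, chosen for illustration).
sx = [[0, 1], [1, 0]]
sy = [[0, -1j], [1j, 0]]

# (sx sy)^dagger should equal sy^dagger sx^dagger.
product_rule_holds = dagger(matmul(sx, sy)) == matmul(dagger(sy), dagger(sx))

# The commutator of Hermitian matrices should be anti-Hermitian.
comm = commutator(sx, sy)
anti_hermitian = dagger(comm) == [[-x for x in row] for row in comm]

# Writing [sx, sy] = i A, the matrix A should be Hermitian.
A = [[x / 1j for x in row] for row in comm]
A_is_hermitian = dagger(A) == A

print(product_rule_holds, anti_hermitian, A_is_hermitian)
```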


3.2.3 Eigenvalues and Eigenvectors 

When an operator acts on a generic vector, the result is a different vector. For 
instance, differentiation operator $\hat{D}$ acting on function $e^{-x^2}$, $\hat{D} e^{-x^2} = -2x e^{-x^2}$, 
produces a different function. If, however, you apply the same operator to function 
$e^{\kappa x}$, the result will be the same function, multiplied by a number: $\hat{D} e^{\kappa x} = \kappa e^{\kappa x}$. This 
example illustrates a general phenomenon: among many vectors that are changed 
by operators into completely different vectors, there are some that are only 
multiplied by a number. This special class of vectors, called eigenvectors, plays 
an important role in the application of operators in quantum physics. The number, 
which appears as a factor in front of an eigenvector, is specific for each vector (or 
a limited subset thereof) and is called an eigenvalue. The formal definition of an 
eigenvector and an eigenvalue is as follows: vector $|\alpha\rangle$ is an eigenvector of operator 
$\hat{T}$ with a respective eigenvalue $\lambda_\alpha$ if 

$$\hat{T} |\alpha\rangle = \lambda_\alpha |\alpha\rangle. \tag{3.32}$$

For each eigenvector there can be one and only one corresponding eigenvalue, 
but the converse of this statement is not always true. If for each eigenvalue there 
exists only a single eigenvector, we describe this eigenvalue as non-degenerate. If 
the opposite happens, and several eigenvectors “belong” to the same eigenvalue, the 
respective eigenvalue is naturally called “degenerate.” In the non-degenerate case, 
an eigenvalue describes a respective eigenvector with an accuracy to a constant 
factor (a vector appearing in Eq. 3.32 can be multiplied by any number without 
destroying the equation). If we, however, require that all eigenvectors be normalized, 
then the eigenvalue will define the respective eigenvector uniquely (with accuracy 
to an arbitrary phase factor, which cannot be fixed by normalization but which does 
not affect any physical results) so that I can designate it simply as $|\lambda\rangle$. 

To distinguish between different eigenvectors belonging to the same eigenvalue, 
I need an additional index so that Eq. 3.32 becomes 

$$\hat{T} |\lambda, \mu\rangle = \lambda |\lambda, \mu\rangle. \tag{3.33}$$

The physical meaning of the additional index will become clear later, but for 
now, it is just a way to distinguish between different eigenvectors belonging to 
the same eigenvalue. An important property of degenerate eigenvectors is that any 
linear combination of these vectors is again an eigenvector belonging to the same 
eigenvalue. Indeed, consider a vector 

$$|\alpha\rangle = a_{\mu_1} |\lambda, \mu_1\rangle + a_{\mu_2} |\lambda, \mu_2\rangle$$

and apply operator $\hat{T}$ to it: 

$$\hat{T} |\alpha\rangle = \hat{T} \left( a_{\mu_1} |\lambda, \mu_1\rangle + a_{\mu_2} |\lambda, \mu_2\rangle \right) = a_{\mu_1} \lambda |\lambda, \mu_1\rangle + a_{\mu_2} \lambda |\lambda, \mu_2\rangle = \lambda |\alpha\rangle,$$

where I used Eq. 3.33. Using mathematical lingo, you can say that eigenvectors 
belonging to a degenerate eigenvalue form a subspace of the total linear space 
because by forming any linear combination thereof you remain within the same 
set of vectors in complete agreement with the definition of a vector space. 

Now I shall prove an important theorem concerning eigenvectors of commuting 
operators and discuss its consequences. 

Theorem 1 (Eigenvectors of Commuting Operators) Consider two operators $\hat{T}_1$ 
and $\hat{T}_2$ such that $\hat{T}_1 \hat{T}_2 = \hat{T}_2 \hat{T}_1$. Also assume that $\lambda_{T_1}$ is a non-degenerate eigenvalue 
of $\hat{T}_1$ with eigenvector $|\lambda_{T_1}\rangle$. Then, this vector is also an eigenvector of the operator 
$\hat{T}_2$. 

Proof Consider 

$$\hat{T}_2 \hat{T}_1 |\lambda_{T_1}\rangle = \lambda_{T_1} \hat{T}_2 |\lambda_{T_1}\rangle = \hat{T}_1 \hat{T}_2 |\lambda_{T_1}\rangle,$$

where at the last step I used the commutative property of the operators. The obtained 
result means that $\hat{T}_2 |\lambda_{T_1}\rangle$ is also an eigenvector of $\hat{T}_1$ with the same eigenvalue $\lambda_{T_1}$. 
However, since it was assumed that $\lambda_{T_1}$ is non-degenerate, this new eigenvector 
can differ from $|\lambda_{T_1}\rangle$ only by a constant factor: 

$$\hat{T}_2 |\lambda_{T_1}\rangle = \lambda_{T_2} |\lambda_{T_1}\rangle,$$

which means that $|\lambda_{T_1}\rangle$ is an eigenvector of $\hat{T}_2$. 

The non-degenerate nature of the eigenvalue of $\hat{T}_1$ is essential for this proof to 
work. Thus, if eigenvalues of $\hat{T}_1$ are degenerate, not all eigenvectors of $\hat{T}_1$ will also 
be eigenvectors of $\hat{T}_2$. However, it can be proven (though the proof is much more 
involved and will not be reproduced here) that one can always form linear 
combinations of these degenerate eigenvectors that are eigenvectors of 
$\hat{T}_2$, each with its own eigenvalue $\lambda_{T_2}$. In this case, assigning eigenvalues of both $\hat{T}_1$ and 
$\hat{T}_2$ might provide a unique characterization of a vector, which is a simultaneous 
eigenvector of both operators and can be denoted as $|\lambda_{T_1}, \lambda_{T_2}\rangle$. Comparing this 
notation to Eq. 3.33, one can see that the index $\mu$ in that equation can be understood 
as an eigenvalue of a commuting partner operator. If there exists a third operator, 
$\hat{T}_3$, commuting with both $\hat{T}_1$ and $\hat{T}_2$, one can find common eigenvectors for all three 
operators, in which case a full unique characterization of such a state would require 
specifying three eigenvalues: $|\lambda_{T_1}, \lambda_{T_2}, \lambda_{T_3}\rangle$, where 

$$\begin{aligned}
\hat{T}_1 |\lambda_{T_1}, \lambda_{T_2}, \lambda_{T_3}\rangle &= \lambda_{T_1} |\lambda_{T_1}, \lambda_{T_2}, \lambda_{T_3}\rangle \\
\hat{T}_2 |\lambda_{T_1}, \lambda_{T_2}, \lambda_{T_3}\rangle &= \lambda_{T_2} |\lambda_{T_1}, \lambda_{T_2}, \lambda_{T_3}\rangle \\
\hat{T}_3 |\lambda_{T_1}, \lambda_{T_2}, \lambda_{T_3}\rangle &= \lambda_{T_3} |\lambda_{T_1}, \lambda_{T_2}, \lambda_{T_3}\rangle.
\end{aligned}$$

In general, in order to characterize an eigenvector of an operator with 
degenerate eigenvalues fully and uniquely, one needs to find the complete set of commuting operators 
(CSCO), i.e., all operators which commute with each other. 

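To help you visualize these rather abstract concepts, I will illustrate them with 
a simple example involving commuting matrices, but you have to be prepared for 
some lengthy computations. So, brace yourself! This example will also illustrate 
the process of finding eigenvalues and eigenvectors of operators in a matrix form. 

Example 11 (Eigenvectors of Commuting Matrices) Consider two 3x3 matrices 

$$\mathrm{M1} = \begin{bmatrix} \frac{5}{4} & \frac{1}{2\sqrt{2}} & \frac{1}{4} \\ \frac{1}{2\sqrt{2}} & \frac{3}{2} & \frac{1}{2\sqrt{2}} \\ \frac{1}{4} & \frac{1}{2\sqrt{2}} & \frac{5}{4} \end{bmatrix}; \quad \mathrm{M2} = \begin{bmatrix} 1 & -\frac{1}{\sqrt{2}} & -1 \\ -\frac{1}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}} \\ -1 & -\frac{1}{\sqrt{2}} & 1 \end{bmatrix}. \tag{3.34}$$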


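It does not take much effort to compute their products (you can use a symbolic 
computation platform such as Mathematica or Maple if you are too lazy to do 
it yourself) and to see that the matrices, indeed, commute: 

$$\mathrm{M1} \cdot \mathrm{M2} = \mathrm{M2} \cdot \mathrm{M1} = \begin{bmatrix} \frac{3}{4} & -\frac{3}{2\sqrt{2}} & -\frac{5}{4} \\ -\frac{3}{2\sqrt{2}} & -\frac{1}{2} & -\frac{3}{2\sqrt{2}} \\ -\frac{5}{4} & -\frac{3}{2\sqrt{2}} & \frac{3}{4} \end{bmatrix}.$$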
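
If no symbolic platform is at hand, a few lines of plain Python do the same job numerically. The sketch below is a hedged illustration: the matrix entries are typed in as they are read from Eq. 3.34, and the minus signs of the off-diagonal 1/sqrt(2) entries of M2 are an assumption inferred from the eigenvectors quoted later in this example. The helper `matmul` is a hypothetical name introduced here.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

s = math.sqrt(2)

# Entries as read from Eq. 3.34 (signs in M2 are an assumption, see above).
M1 = [[5 / 4, 1 / (2 * s), 1 / 4],
      [1 / (2 * s), 3 / 2, 1 / (2 * s)],
      [1 / 4, 1 / (2 * s), 5 / 4]]
M2 = [[1, -1 / s, -1],
      [-1 / s, 0, -1 / s],
      [-1, -1 / s, 1]]

P12 = matmul(M1, M2)
P21 = matmul(M2, M1)

# Largest entry of the commutator M1 M2 - M2 M1; zero up to rounding.
max_commutator_entry = max(abs(x - y)
                           for r1, r2 in zip(P12, P21)
                           for x, y in zip(r1, r2))
print("largest commutator entry:", max_commutator_entry)
for row in P12:
    print([round(x, 4) for x in row])
```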


Vectors in this case are single columns with three elements: 

$$\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix},$$

and the eigenvector equation 3.32 takes the form of a matrix equation. For M1 this 
equation is 

$$\begin{bmatrix} \frac{5}{4} & \frac{1}{2\sqrt{2}} & \frac{1}{4} \\ \frac{1}{2\sqrt{2}} & \frac{3}{2} & \frac{1}{2\sqrt{2}} \\ \frac{1}{4} & \frac{1}{2\sqrt{2}} & \frac{5}{4} \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} = \lambda \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}.$$

It is convenient to collect all terms on one side and present this equation in the form 

$$\begin{bmatrix} \frac{5}{4} - \lambda & \frac{1}{2\sqrt{2}} & \frac{1}{4} \\ \frac{1}{2\sqrt{2}} & \frac{3}{2} - \lambda & \frac{1}{2\sqrt{2}} \\ \frac{1}{4} & \frac{1}{2\sqrt{2}} & \frac{5}{4} - \lambda \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} = 0. \tag{3.35}$$

What we have here is a matrix form of a system of three linear homogeneous 
equations, which always has at least one solution: $u_1 = u_2 = u_3 = 0$. This solution, 
however, is not what I had in mind when introducing the concept of eigenvectors. 
We need non-zero solutions, but they exist only if the determinant of the 
matrix representing coefficients of this equation is equal to zero. (Cramer’s rule of 
linear algebra, anyone?) Computing the determinant and setting it to zero, I arrive 
at the following equation: 

$$\lambda^3 - 4\lambda^2 + 5\lambda - 2 = 0,$$

which has three solutions, $\lambda_{1,2} = 1$; $\lambda_3 = 2$, two of which coincide, signifying 
that the matrix does have a degenerate eigenvalue. (These solutions can be found by 
factoring the determinant as $(\lambda - 1)^2 (\lambda - 2)$.) 

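Now, for each eigenvalue, I will find a respective eigenvector, beginning with the 
non-degenerate eigenvalue $\lambda_3 = 2$. Substituting this eigenvalue in Eq. 3.35, I reduce 
it to 

$$\begin{bmatrix} -\frac{3}{4} & \frac{1}{2\sqrt{2}} & \frac{1}{4} \\ \frac{1}{2\sqrt{2}} & -\frac{1}{2} & \frac{1}{2\sqrt{2}} \\ \frac{1}{4} & \frac{1}{2\sqrt{2}} & -\frac{3}{4} \end{bmatrix} \begin{bmatrix} u_1^{(3)} \\ u_2^{(3)} \\ u_3^{(3)} \end{bmatrix} = 0,$$

where the added upper index in $u_i^{(3)}$ indicates that this eigenvector belongs to the third 
eigenvalue. Expanding the matrix equation in an explicit system of linear equations 
yields 

$$\begin{aligned}
-\frac{3}{4} u_1^{(3)} + \frac{1}{2\sqrt{2}} u_2^{(3)} + \frac{1}{4} u_3^{(3)} = 0 \;&\Rightarrow\; -3 u_1^{(3)} + \sqrt{2} u_2^{(3)} + u_3^{(3)} = 0 \\
\frac{1}{2\sqrt{2}} u_1^{(3)} - \frac{1}{2} u_2^{(3)} + \frac{1}{2\sqrt{2}} u_3^{(3)} = 0 \;&\Rightarrow\; u_1^{(3)} - \sqrt{2} u_2^{(3)} + u_3^{(3)} = 0 \\
\frac{1}{4} u_1^{(3)} + \frac{1}{2\sqrt{2}} u_2^{(3)} - \frac{3}{4} u_3^{(3)} = 0 \;&\Rightarrow\; u_1^{(3)} + \sqrt{2} u_2^{(3)} - 3 u_3^{(3)} = 0.
\end{aligned}$$

Combining the last two equations, I get $2u_1^{(3)} - 2u_3^{(3)} = 0 \Rightarrow u_1^{(3)} = u_3^{(3)}$. Then, the 
first two equations are reduced to two identical equations: 

$$\begin{aligned}
-2u_1^{(3)} + \sqrt{2} u_2^{(3)} &= 0 \\
2u_1^{(3)} - \sqrt{2} u_2^{(3)} &= 0,
\end{aligned}$$

which means that the value for one of the coefficients $u_{1,2}^{(3)}$ can be chosen arbitrarily. 
For instance, you can express these coefficients in terms of yet undefined $u_1^{(3)}$: 
$u_2^{(3)} = \sqrt{2} u_1^{(3)}$; $u_3^{(3)} = u_1^{(3)}$. Using notation $|2\rangle$ to designate this eigenvector (2 
in this notation refers to the value of the respective eigenvalue), I can write 

$$|2\rangle = u_1^{(3)} \begin{bmatrix} 1 \\ \sqrt{2} \\ 1 \end{bmatrix}.$$

The value of the remaining coefficient can be fixed (if the undefined coefficients 
make you nervous) by requiring that the vector is normalized: 

$$\langle 2|2\rangle = \left| u_1^{(3)} \right|^2 (1 + 2 + 1) = 1 \;\Rightarrow\; u_1^{(3)} = \frac{1}{2}.$$

Thus, the normalized eigenvector belonging to the eigenvalue $\lambda = 2$ is found to be 

$$|2\rangle = \frac{1}{2} \begin{bmatrix} 1 \\ \sqrt{2} \\ 1 \end{bmatrix}. \tag{3.36}$$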


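Now let me deal with the degenerate eigenvalue $\lambda_{1,2} = 1$. In this case, the eigenvector 
equation becomes 

$$\begin{bmatrix} \frac{1}{4} & \frac{1}{2\sqrt{2}} & \frac{1}{4} \\ \frac{1}{2\sqrt{2}} & \frac{1}{2} & \frac{1}{2\sqrt{2}} \\ \frac{1}{4} & \frac{1}{2\sqrt{2}} & \frac{1}{4} \end{bmatrix} \begin{bmatrix} u_1^{(1,2)} \\ u_2^{(1,2)} \\ u_3^{(1,2)} \end{bmatrix} = 0$$

or in the expanded form 

$$\begin{aligned}
\frac{1}{4} u_1^{(1,2)} + \frac{1}{2\sqrt{2}} u_2^{(1,2)} + \frac{1}{4} u_3^{(1,2)} = 0 \;&\Rightarrow\; u_1^{(1,2)} + \sqrt{2} u_2^{(1,2)} + u_3^{(1,2)} = 0 \\
\frac{1}{2\sqrt{2}} u_1^{(1,2)} + \frac{1}{2} u_2^{(1,2)} + \frac{1}{2\sqrt{2}} u_3^{(1,2)} = 0 \;&\Rightarrow\; u_1^{(1,2)} + \sqrt{2} u_2^{(1,2)} + u_3^{(1,2)} = 0 \\
\frac{1}{4} u_1^{(1,2)} + \frac{1}{2\sqrt{2}} u_2^{(1,2)} + \frac{1}{4} u_3^{(1,2)} = 0 \;&\Rightarrow\; u_1^{(1,2)} + \sqrt{2} u_2^{(1,2)} + u_3^{(1,2)} = 0.
\end{aligned}$$

In this case, all three equations coincide, meaning that I can choose arbitrarily two 
coefficients, e.g., $u_1^{(1,2)}$ and $u_3^{(1,2)}$, while expressing the remaining coefficient as 

$$u_2^{(1,2)} = -\left( u_1^{(1,2)} + u_3^{(1,2)} \right) / \sqrt{2}.$$

Choosing different values of the free coefficients, I can generate different 
eigenvectors all belonging to the same eigenvalue. For instance, choosing 
$(u_1, u_3) = (0, 1)$ and $(u_1, u_3) = (1, 0)$, I generate two distinct vectors: 

$$|1\rangle_1 = \begin{bmatrix} 0 \\ -\frac{1}{\sqrt{2}} \\ 1 \end{bmatrix}; \quad |1\rangle_2 = \begin{bmatrix} 1 \\ -\frac{1}{\sqrt{2}} \\ 0 \end{bmatrix},$$

which can also be normalized. Any linear combination of these vectors will also be 
an eigenvector. 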
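
The eigenvectors found for M1 can be verified by direct multiplication. The sketch below assumes the entries of M1 as read from Eq. 3.34; the helpers `apply` and `is_eigenvector` and the coefficients 2 and -3 of the test combination are hypothetical choices made only for this illustration. It checks the normalized eigenvector for the eigenvalue 2, two independent eigenvectors for the degenerate eigenvalue 1, and one arbitrary linear combination of the latter.

```python
import math

def apply(m, v):
    """Multiply matrix m (nested lists) by column vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def is_eigenvector(m, v, lam, tol=1e-12):
    """True if m v = lam v holds component by component within tol."""
    return all(abs(w - lam * x) < tol for w, x in zip(apply(m, v), v))

s = math.sqrt(2)
M1 = [[5 / 4, 1 / (2 * s), 1 / 4],
      [1 / (2 * s), 3 / 2, 1 / (2 * s)],
      [1 / 4, 1 / (2 * s), 5 / 4]]

v2 = [1 / 2, s / 2, 1 / 2]     # normalized eigenvector for lambda = 2
v1a = [0, -1 / s, 1]           # degenerate eigenvalue lambda = 1
v1b = [1, -1 / s, 0]           # degenerate eigenvalue lambda = 1
combo = [2 * a - 3 * b for a, b in zip(v1a, v1b)]   # arbitrary combination

checks = [is_eigenvector(M1, v2, 2),
          is_eigenvector(M1, v1a, 1),
          is_eigenvector(M1, v1b, 1),
          is_eigenvector(M1, combo, 1)]
norm_v2 = math.sqrt(sum(x * x for x in v2))
print(checks, norm_v2)
```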

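Now I turn my attention to matrix M2. Again computing the determinant 

$$\begin{vmatrix} 1 - \lambda & -\frac{1}{\sqrt{2}} & -1 \\ -\frac{1}{\sqrt{2}} & -\lambda & -\frac{1}{\sqrt{2}} \\ -1 & -\frac{1}{\sqrt{2}} & 1 - \lambda \end{vmatrix}$$

and setting it to zero, I end up with the equation 

$$\lambda^3 - 2\lambda^2 - \lambda + 2 = 0,$$

which again can be solved by factorization and yields $\lambda_1 = 2$, $\lambda_2 = -1$, $\lambda_3 = 1$. 
Each of these eigenvalues (which, by the way, are non-degenerate) has its own 
eigenvector, which can be found in the same way as above. I will leave the actual 
calculations as an exercise and present here only the final answers for the normalized 
eigenvectors: 

$$|2\rangle = \frac{1}{\sqrt{2}} \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}; \quad |-1\rangle = \frac{1}{2} \begin{bmatrix} 1 \\ \sqrt{2} \\ 1 \end{bmatrix}; \quad |1\rangle = \frac{1}{2} \begin{bmatrix} 1 \\ -\sqrt{2} \\ 1 \end{bmatrix}, \tag{3.38}$$

where eigenvectors are again labeled by their respective eigenvalues. Now, it is 
obvious that eigenvector $|-1\rangle$ of matrix M2 is also an eigenvector of M1 (it coincides with the vector in Eq. 3.36), so I only 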
need to check the remaining vectors: 

$$\begin{bmatrix} \frac{5}{4} & \frac{1}{2\sqrt{2}} & \frac{1}{4} \\ \frac{1}{2\sqrt{2}} & \frac{3}{2} & \frac{1}{2\sqrt{2}} \\ \frac{1}{4} & \frac{1}{2\sqrt{2}} & \frac{5}{4} \end{bmatrix} \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix},$$

so this vector is an eigenvector of M1 with eigenvalue $\lambda = 1$. Note that the 
elements of this vector obey condition $u_2^{(1,2)} = -\left( u_1^{(1,2)} + u_3^{(1,2)} \right) / \sqrt{2}$ derived for 
the degenerate eigenvectors of M1, with $u_2 = 0$ and $u_3 = -u_1 = 1$. Now, for the 
remaining eigenvector of M2, I have 

$$\begin{bmatrix} \frac{5}{4} & \frac{1}{2\sqrt{2}} & \frac{1}{4} \\ \frac{1}{2\sqrt{2}} & \frac{3}{2} & \frac{1}{2\sqrt{2}} \\ \frac{1}{4} & \frac{1}{2\sqrt{2}} & \frac{5}{4} \end{bmatrix} \begin{bmatrix} 1 \\ -\sqrt{2} \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ -\sqrt{2} \\ 1 \end{bmatrix},$$

i.e., this is also an eigenvector of M1 with the same eigenvalue. For this vector 
I also have $u_2 = -(u_1 + u_3) / \sqrt{2}$ with $u_1 = u_3 = 1$. Thus, I can present 
the system of common eigenvectors of these two matrices, in which degenerate 
eigenvectors become uniquely defined by virtue of their belonging to the 
eigenvalues of a second commuting matrix. Now, all these eigenvectors can be 
designated as $|1, 2\rangle$, $|1, 1\rangle$, and $|2, -1\rangle$, where the first and second numbers refer to 
the eigenvalues of M1 and M2, respectively. 
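
The labeling of the common eigenvectors by pairs of eigenvalues can be reproduced with a short numerical check. The sketch below is illustrative only: matrix entries are assumed as read from Eq. 3.34 (with the off-diagonal 1/sqrt(2) entries of M2 taken negative), and `apply` and `eigenvalue` are hypothetical helper names. For each candidate vector it extracts the eigenvalue with respect to M1 and to M2 and collects the two numbers into a label.

```python
import math

def apply(m, v):
    """Multiply matrix m (nested lists) by column vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def eigenvalue(m, v, tol=1e-12):
    """Return lam if m v = lam v within tol, otherwise None."""
    w = apply(m, v)
    k = max(range(len(v)), key=lambda i: abs(v[i]))   # largest component
    lam = w[k] / v[k]
    ok = all(abs(wi - lam * vi) < tol for wi, vi in zip(w, v))
    return lam if ok else None

s = math.sqrt(2)
M1 = [[5 / 4, 1 / (2 * s), 1 / 4],
      [1 / (2 * s), 3 / 2, 1 / (2 * s)],
      [1 / 4, 1 / (2 * s), 5 / 4]]
M2 = [[1, -1 / s, -1],
      [-1 / s, 0, -1 / s],
      [-1, -1 / s, 1]]

# Normalized eigenvectors of M2, in the order of its eigenvalues 2, 1, -1.
common = [[-1 / s, 0, 1 / s],
          [1 / 2, -s / 2, 1 / 2],
          [1 / 2, s / 2, 1 / 2]]

# Each label is (eigenvalue of M1, eigenvalue of M2).
labels = [(round(eigenvalue(M1, v)), round(eigenvalue(M2, v))) for v in common]
print(labels)   # [(1, 2), (1, 1), (2, -1)]
```

No two labels coincide, which is the sense in which the pair of eigenvalues characterizes each common eigenvector uniquely.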


3.3 Operators and Observables 
3.3.1 Hermitian Operators 

One might notice a striking similarity between CSCO and the concept of the 
complete set of mutually consistent observables discussed in Sect. 2.1. Also, the 
state vectors characterized by definite values of compatible observables look like 
eigenvectors of operators characterized by eigenvalues of commuting operators. 
It appears reasonable, therefore, to expect that one can establish a connection 
between physical observables and quantum states characterized by the values of 
the observables on one hand and the mathematical concepts of operators and 
their eigenvalues and eigenvectors on the other hand. This connection is indeed 
established by the following postulates laying down the foundation of formalism of 
quantum mechanics. 

Postulate 1 (Observables and Hermitian Operators) Every observable is 
represented in quantum theory by a Hermitian operator. 

Postulate 2 Eigenvalues of operators constructed to represent an observable 
determine values, which a measurement of the observable might yield, and 
eigenvectors define states, in which a measurement of the observable represented 
by the operator will with certainty produce the corresponding value. 

The first question which might pop up in someone’s mind after reading the first 
of these postulates is, why does it single out Hermitian operators? The fact of the 
matter is that Hermitian operators possess a number of special properties, which 
make them practically suitable for their intended use as representative of physical 
observables. These properties can be formulated in the form of several theorems. 

Theorem 2 (Theorem of the Eigenvalues) Eigenvalues of Hermitian operators 
with discrete spectrum are necessarily real-valued. 

Proof Let $|\lambda_n\rangle$ be an eigenvector of a Hermitian operator $\hat{T}$ corresponding to 
eigenvalue $\lambda_n$: 

$$\hat{T} |\lambda_n\rangle = \lambda_n |\lambda_n\rangle.$$

Premultiplying this expression by $\langle\lambda_n|$, I get 

$$\langle\lambda_n| \hat{T} |\lambda_n\rangle = \lambda_n \langle\lambda_n|\lambda_n\rangle.$$

Performing complex conjugation of this expression and using the definition of the 
Hermitian conjugate operator, Eq. 3.14, I derive 

$$\langle\lambda_n| \hat{T}^\dagger |\lambda_n\rangle = \lambda_n^* \langle\lambda_n|\lambda_n\rangle,$$

where it is assumed that the norm of the vector exists and is a real-valued quantity. 
For Hermitian operators $\hat{T}^\dagger = \hat{T}$, in which case left-hand sides of the last two 
equations coincide, yielding $\lambda_n^* = \lambda_n$, which means, of course, that $\lambda_n$ is a real 
number. 
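
A small numerical experiment makes the role of hermiticity visible. The sketch below is an illustration under stated assumptions: both 2x2 matrices are hypothetical examples, and `eigenvalues_2x2` is a helper introduced here that solves the characteristic equation with the quadratic formula. The Hermitian matrix yields eigenvalues with vanishing imaginary parts, while the non-Hermitian one yields purely imaginary eigenvalues.

```python
import cmath

def eigenvalues_2x2(m):
    """Roots of the characteristic equation lam^2 - tr(m) lam + det(m) = 0."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2

# Hypothetical Hermitian matrix: equal to its own conjugate transpose.
hermitian = [[2, 1 - 1j], [1 + 1j, 3]]
# Hypothetical non-Hermitian matrix for contrast.
not_hermitian = [[0, 1], [-1, 0]]

herm_vals = eigenvalues_2x2(hermitian)
other_vals = eigenvalues_2x2(not_hermitian)
print("Hermitian:", herm_vals)
print("non-Hermitian:", other_vals)
```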

The importance of this theorem for the association between physical observables 
and operators is obvious: results of any measurements are always expressed by real 
numbers, and the theorem guarantees that the mathematical constructs (eigenvalues) 
used to connect the formalism with the real world of experiments and observations 
are consistent with this natural requirement. The assumption that the norm of 
the respective eigenvectors exists, which is a critical element of the proof of the 
theorem, can be rigorously validated only for Hermitian operators with discrete 
spectrum.³ 

3 I borrowed this fact without proof from the branch of mathematics called functional analysis that 
studies the properties of linear operators. 

Eigenvectors of operators with continuous spectrum are not normalizable in the 
usual sense (see Sect. 2.3), so this theorem does not apply to them. At the same 
time, we need such continuous spectrum operators as momentum or coordinate 
to describe physical reality, so we have to find a way to avoid having to deal 
with unrealistic complex eigenvalues. Leaving the mathematical intricacies of this 
problem to mathematicians, I solve it here by a sleight of hand. I simply postulate 
that only real eigenvalues and their corresponding eigenvectors of such operators 
can be used to represent quantum states and the results of measurements. It can 
be shown that the eigenvectors corresponding to real eigenvalues of Hermitian 
operators with continuous spectrum can be normalized in the sense of Eq. 2.41. To 
illustrate the last point, consider operator $i\,d/dx$ that I have previously proved to be 
Hermitian. The eigenvectors of this operator have the form of $e^{-ikx}$, with $k$ being an 
eigenvalue: 

$$i \frac{d}{dx} e^{-ikx} = k e^{-ikx}.$$

If I force $k$ to be a real number, I can use the properties of the delta-function to write 

$$\int dx\, e^{ix(k - k_1)} = 2\pi \delta(k - k_1),$$

which is the orthonormalization requirement for the eigenvectors belonging to 
continuous spectrum. You may want to notice that the integral in this expression 
is reduced to the delta-function only for real-valued $k$. 

Theorem 3 (Theorem of Eigenvectors) Eigenvectors of Hermitian operators with 
discrete spectrum belonging to different eigenvalues are necessarily orthogonal. 

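Proof Consider two different eigenvalues $\lambda_1$ and $\lambda_2$ of a Hermitian operator $\hat{T}$ 
together with their eigenvectors $|\lambda_1\rangle$ and $|\lambda_2\rangle$: 

$$\begin{aligned}
\hat{T} |\lambda_1\rangle &= \lambda_1 |\lambda_1\rangle \\
\hat{T} |\lambda_2\rangle &= \lambda_2 |\lambda_2\rangle.
\end{aligned}$$

Premultiply the first of these equations by $\langle\lambda_2|$ and the second one by $\langle\lambda_1|$: 

$$\begin{aligned}
\langle\lambda_2| \hat{T} |\lambda_1\rangle &= \lambda_1 \langle\lambda_2|\lambda_1\rangle \\
\langle\lambda_1| \hat{T} |\lambda_2\rangle &= \lambda_2 \langle\lambda_1|\lambda_2\rangle.
\end{aligned}$$

Complex conjugate the second of these equations, use Eq. 3.14 (which defines the 
Hermitian conjugate operator), and take into account that $\hat{T}$ is Hermitian. This yields 

$$\langle\lambda_2| \hat{T} |\lambda_1\rangle = (\lambda_2)^* \langle\lambda_1|\lambda_2\rangle^*,$$

so that the pair of equations from above can be written as 

$$\begin{aligned}
\langle\lambda_2| \hat{T} |\lambda_1\rangle &= \lambda_1 \langle\lambda_2|\lambda_1\rangle \\
\langle\lambda_2| \hat{T} |\lambda_1\rangle &= (\lambda_2)^* \langle\lambda_1|\lambda_2\rangle^*.
\end{aligned}$$

Taking into account that the eigenvalues of the Hermitian operators are real and that 
according to the property of the inner product $\langle\lambda_2|\lambda_1\rangle = \langle\lambda_1|\lambda_2\rangle^*$, you finally 
obtain 

$$\lambda_1 \langle\lambda_2|\lambda_1\rangle = \lambda_2 \langle\lambda_2|\lambda_1\rangle.$$

If $\lambda_1 \neq \lambda_2$, you have no choice but to conclude that $\langle\lambda_2|\lambda_1\rangle = 0$. 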

In the case of Hermitian operators with degenerate spectrum, the situation is more 
complex because, as we saw in the matrix example in Sect. 3.2.3, one can generate 
multiple sets of linearly independent vectors belonging to the same eigenvalue, 
and they do not have to be orthogonal. At the same time, we also saw that one 
can always find such a set, in which eigenvectors are orthogonal. These special 
sets of orthogonal vectors belonging to the degenerate eigenvalues are usually also 
eigenvectors of another operator from the respective CSCO. Thus, you can rest 
assured that for any Hermitian operator, there exists a set of mutually orthogonal 
eigenvectors. I already mentioned that the physical meaning of the mathematical 
concept of orthogonality is mutual exclusivity of values of the observables used to 
characterize the states, and this comment essentially completes our identification 
of mutually exclusive states characterized by a set of mutually consistent 
observables with eigenvectors of operators belonging to a complete set of mutually 
commuting operators. 
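
Both situations can be seen with the vectors of the matrix example in Sect. 3.2.3. The sketch below is a hedged illustration: the vectors are typed in as they are read from that example, `inner` is a hypothetical helper, and the Gram-Schmidt step used to orthogonalize the degenerate pair is a standard technique brought in here, not the construction used in the text (which relies on a second commuting matrix).

```python
import math

def inner(u, v):
    """Inner product <u|v>; the first argument is complex-conjugated."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

s = math.sqrt(2)

# Eigenvectors of M2 belonging to three different eigenvalues (2, 1, -1).
e2, e1, em1 = [-1 / s, 0, 1 / s], [1 / 2, -s / 2, 1 / 2], [1 / 2, s / 2, 1 / 2]
overlaps = [abs(inner(e2, e1)), abs(inner(e2, em1)), abs(inner(e1, em1))]

# Two eigenvectors of M1 sharing the degenerate eigenvalue 1: not orthogonal.
d1, d2 = [0, -1 / s, 1], [1, -1 / s, 0]
overlap_degenerate = inner(d1, d2)

# One Gram-Schmidt step: remove from d2 its component along d1.
coef = inner(d1, d2) / inner(d1, d1)
d2_orth = [b - coef * a for a, b in zip(d1, d2)]

print("different eigenvalues:", overlaps)
print("degenerate pair:", overlap_degenerate)
print("after orthogonalization:", abs(inner(d1, d2_orth)))
```

The orthogonalized vector is still a linear combination of the two degenerate eigenvectors, so it remains an eigenvector with the same eigenvalue.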

Theorem 4 (Completeness of Eigenvectors) The set of eigenvectors of Hermitian 
operators is complete in a sense that any state in the respective Hilbert vector space 
can be presented as a linear combination of these eigenvectors. 

The completeness property gives a rigorous mathematical justification to the 
generalization of the superposition principle expressed by Eq. 2.26. This property 
essentially states that eigenvectors of Hermitian operators with discrete spectrum 
form a countable basis in the Hilbert vector space. It can also be expressed in the 
form of a so-called completeness or “closure” relation, which can be presented as 
a useful operator identity. To derive it, I first rewrite Eq. 2.26 in a more compact 
form as 

$$|\alpha\rangle = \sum_n a_n |\chi_n\rangle, \tag{3.39}$$

where index $n$ enumerates the eigenvectors and each eigenvector $|\chi_n\rangle$, which is 
assumed to be normalized, is characterized by all available eigenvalues of the 
respective CSCO. Expansion coefficients $a_n$ in this expression can be found as 
$a_n = \langle\chi_n|\alpha\rangle$, as established in Eq. 2.24. After substitution of this expression back 
into Eq. 3.39, the latter becomes 

$$|\alpha\rangle = \sum_n |\chi_n\rangle \langle\chi_n|\alpha\rangle = \left( \sum_n |\chi_n\rangle \langle\chi_n| \right) |\alpha\rangle. \tag{3.40}$$

In the last expression here, I split off ket vector $|\alpha\rangle$ from the bra $\langle\chi_n|$ and combined 
the latter with another ket $|\chi_n\rangle$. The ket and bra vectors enclosed in the parentheses 
are in unusual positions: the ket is on the left of the bra, which is opposite to 
their regular positions in the standard inner product. As you can guess, expression 
$\hat{P}^{(n)} = |\chi_n\rangle \langle\chi_n|$ is not an inner product, but does it have any sensible meaning 
at all? In the matrix example of the vectors, this expression corresponds to the 
situation in which the column vector is written down to the left of the row vector, 
the arrangement used to form the outer or tensor product mentioned in the previous 
section. Respectively, in the case of abstract generic ket and bra vectors, $|\chi_n\rangle \langle\chi_n|$ 
can be understood as an outer product of two vectors. Naturally, just as the outer 
product of rows and columns yields a matrix, the outer product of bras and kets 
generates an operator: indeed, if you bring the split-off ket vector back, you can 
construct the following expression: 

$$\hat{P}^{(n)} |\alpha\rangle = |\chi_n\rangle \langle\chi_n|\alpha\rangle, \tag{3.41}$$

i.e., the result of the action of $\hat{P}^{(n)}$ on $|\alpha\rangle$ is vector $|\chi_n\rangle$ multiplied by a number. 
If $|\alpha\rangle$ and $|\chi_n\rangle$ were a regular three-dimensional vector and one of the unit vectors 
specifying a particular direction correspondingly, you could say that $\hat{P}^{(n)}$ projects 
$|\alpha\rangle$ on $|\chi_n\rangle$ and generates a component of $|\alpha\rangle$ in the direction specified by $|\chi_n\rangle$. It 
is customary to maintain the same terminology and call operator $\hat{P}^{(n)}$ a projection 
operator. 

Example 12 (Projection Operators) To get accustomed to working with operators 
of the form $\hat{P}^{(n)} = |\chi_n\rangle \langle\chi_n|$, let me prove the main property of the projection 
operators, $\left[ \hat{P}^{(n)} \right]^2 = \hat{P}^{(n)}$: 

$$\left[ \hat{P}^{(n)} \right]^2 = |\chi_n\rangle \langle\chi_n|\chi_n\rangle \langle\chi_n|.$$

The expression in the middle looks like an inner product of a basis vector with itself, 
and as such it is equal to unity. Thus, we have 

$$\left[ \hat{P}^{(n)} \right]^2 = |\chi_n\rangle \langle\chi_n| = \hat{P}^{(n)}.$$

The expression inside the parentheses in Eq. 3.40 is a sum of projection operators, 
but most importantly, it is easy to see that this sum is identical to a unity operator: 
it acts on vector $|\alpha\rangle$ and generates the same vector. This statement can be written as 
the following identity: 

$$\sum_n |\chi_n\rangle \langle\chi_n| = \hat{I}, \tag{3.42}$$

which is the completeness or closure relation. This is a useful operator identity, 
which will be frequently used in what follows. 
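
In a finite-dimensional space the projection operators and the closure relation become statements about ordinary matrices, which makes them easy to test. The sketch below is illustrative only: it takes as the basis the three normalized eigenvectors of matrix M2 from the example in Sect. 3.2.3, and the helper names (`outer`, `matmul`, `close`) and the test vector `alpha` are hypothetical. It checks that each projector squares to itself, that the projectors sum to the identity matrix, and that the coefficients obtained as inner products rebuild the original vector.

```python
import math

def outer(u, v):
    """Outer product |u><v| as a matrix (real vectors, no conjugation needed)."""
    return [[a * b for b in v] for a in u]

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    """True if two matrices agree entry by entry within tol."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

s = math.sqrt(2)
# Orthonormal basis: normalized eigenvectors of M2 from the matrix example.
basis = [[-1 / s, 0, 1 / s], [1 / 2, -s / 2, 1 / 2], [1 / 2, s / 2, 1 / 2]]

projectors = [outer(chi, chi) for chi in basis]
identity = [[float(i == j) for j in range(3)] for i in range(3)]
total = [[sum(p[i][j] for p in projectors) for j in range(3)] for i in range(3)]

idempotent = all(close(matmul(p, p), p) for p in projectors)   # P^2 = P
complete = close(total, identity)                              # sum of P = I

# Expansion of an arbitrary vector: a_n = <chi_n|alpha>.
alpha = [0.3, -1.2, 2.0]
coeffs = [sum(c * a for c, a in zip(chi, alpha)) for chi in basis]
rebuilt = [sum(a_n * chi[i] for a_n, chi in zip(coeffs, basis)) for i in range(3)]

print(idempotent, complete, rebuilt)
```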
Not all vector spaces used in quantum mechanics can be described by a discrete 
basis, and sometimes we have to use as a basis eigenvectors of operators with 
continuous spectrum. I have already discussed this possibility in Sect. 2.3 using 
states characterized by a definite value of particle’s position $|\mathbf{r}\rangle$. Now you can 
associate these states with eigenvectors of a position operator $\hat{\mathbf{r}}$. In general, if $|q\rangle$ 
is an eigenvector of some Hermitian operator with continuous spectrum and $q$ is the 
respective eigenvalue, you can present an arbitrary state $|\alpha\rangle$ as an integral instead of 
a sum: 

$$|\alpha\rangle = \int dq\, \psi(q) |q\rangle. \tag{3.43}$$

Premultiplying Eq. 3.43 by bra $\langle q_1|$ and using the orthogonality condition for 
continuous spectrum, Eq. 2.41, you will obtain 

$$\langle q_1|\alpha\rangle = \int dq\, \psi(q) \langle q_1|q\rangle = \int dq\, \psi(q) \delta(q - q_1) = \psi(q_1). \tag{3.44}$$

Replacing $\psi(q)$ in Eq. 3.43 with its expression derived in Eq. 3.44, you end up with 

$$|\alpha\rangle = \int dq\, |q\rangle \langle q|\alpha\rangle.$$

Considering expression $\int dq\, |q\rangle \langle q|$ as an operator, you can, similarly to the case of 
discrete basis, write 

$$\int dq\, |q\rangle \langle q| = \hat{I}. \tag{3.45}$$

Equation 3.45 constitutes a completeness condition for eigenvectors of operators 
with continuous spectrum. 

Example 13 (Expansion in Terms of Continuous Basis) To illustrate Eq. 3.43, 
consider again a linear vector space of integrable functions of a single variable: 
$|\alpha\rangle \equiv f(x)$. The Fourier transform of this function can be defined as 

$$f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dk\, \tilde{f}(k) e^{ikx},$$

where the “coefficient” function $\tilde{f}(k)$ is defined via the inverse transform 

$$\tilde{f}(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dx\, f(x) e^{-ikx}.$$

The role of the continuous basis is played here by functions 

$$|k\rangle \equiv \frac{1}{\sqrt{2\pi}} e^{ikx},$$

which are eigenvectors of Hermitian operator $-i\,d/dx$ with continuous spectrum 
consisting of real numbers $k$. These eigenvectors are orthogonal and delta-function 
normalized: 

$$\frac{1}{2\pi} \int_{-\infty}^{\infty} dx\, e^{i(k_1 - k)x} = \delta(k - k_1).$$

The completeness condition, Eq. 3.45, for these functions takes the form of 

$$\frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, e^{i(x_1 - x)k} = \delta(x - x_1),$$

with delta-function $\delta(x - x_1)$ playing the role of the identity operator $\hat{I}$ in this space: 

$$\int f(x) \delta(x - x_1)\, dx = f(x_1).$$

Some operators have a mixed spectrum: it is discrete for one range of eigenvalues 
and continuous for another range. The completeness relation in this case will be a 
combination of Eq. 3.42 and Eq. 3.45, with the sum over all discrete eigenvectors and 
the integral over the continuous ones. 


3.3.2 Quantization Postulate 

Most physical observables can be constructed from just two elements: position 
vector $\mathbf{r}$ and momentum $\mathbf{p}$. I have already introduced states with definite values of the 
position vector, $|\mathbf{r}\rangle$, which are supposed to be eigenvectors of a respective Hermitian 
operator $\hat{\mathbf{r}}$. Similarly, I can introduce states with definite values of momentum $|\mathbf{p}\rangle$, 
which are supposed to be eigenvectors of the Hermitian momentum operator $\hat{\mathbf{p}}$. The 
first thing you will want to know, of course, is what these operators do 
to quantum states. You could have guessed the answer for the states represented by 
eigenvectors of respective operators: $\hat{\mathbf{r}} |\mathbf{r}\rangle = \mathbf{r} |\mathbf{r}\rangle$, $\hat{\mathbf{p}} |\mathbf{p}\rangle = \mathbf{p} |\mathbf{p}\rangle$, where I placed a hat 
above $\mathbf{r}$ and $\mathbf{p}$ to better distinguish between symbols of respective operators and their 
eigenvalues and eigenvectors. Using these results I can compute expressions like 
$\hat{\mathbf{r}} |\alpha\rangle$ or $\hat{\mathbf{p}} |\alpha\rangle$ by expanding the state $|\alpha\rangle$ in terms of eigenvectors of the respective 
operators. For instance, by presenting 

$$|\alpha\rangle = \int d\mathbf{r}\, \psi(\mathbf{r}) |\mathbf{r}\rangle,$$

I can find 

$$\hat{\mathbf{r}} |\alpha\rangle = \int d\mathbf{r}\, \psi(\mathbf{r})\, \hat{\mathbf{r}} |\mathbf{r}\rangle = \int d\mathbf{r}\, \psi(\mathbf{r})\, \mathbf{r} |\mathbf{r}\rangle.$$

Similar treatment for the momentum operator yields 

$$\hat{\mathbf{p}} |\alpha\rangle = \int d\mathbf{p}\, \varphi(\mathbf{p})\, \hat{\mathbf{p}} |\mathbf{p}\rangle = \int d\mathbf{p}\, \varphi(\mathbf{p})\, \mathbf{p} |\mathbf{p}\rangle.$$

The problem arises when both position and momentum operators appear in the 
same expression and we have to figure out how to operate, say, with $\hat{\mathbf{p}}$ on a state 
expanded in terms of eigenvectors of $\hat{\mathbf{r}}$ or vice versa. I will discuss this issue 
later in the book, in the section devoted to “representations” of the state vectors 
and operators. For now I would just like to say that the solution to this problem 
depends on the fundamental assumptions about commutation relations involving 
position and momentum operators. Essentially, the quantization procedure, i.e., the 
rules determining how to replace classical observables with their representation as 
quantum operators, consists in the postulation of these commutation relations. You 
will see many times in this text that the knowledge of the commutators of various 
operators is all you need to perform quantum mechanical calculations. 
So, please meet the fundamental commutation relations of quantum mechanics. 


Postulate 3 (Quantization Postulate) Operators, corresponding to various 
Cartesian components of position vector and momentum, obey the following 
commutation relations: 

$$[\hat{r}_i, \hat{r}_j] = 0; \quad [\hat{p}_i, \hat{p}_j] = 0, \tag{3.46}$$

$$[\hat{r}_i, \hat{p}_j] = i\hbar \delta_{ij}, \tag{3.47}$$

where subindexes take values 1, 2, 3 indicating x, y, z Cartesian components of 
the position and momentum vectors, respectively. 


The first of the commutators in Eq. 3.46 indicates that the Cartesian components 
of the position vectors are mutually consistent observables. In other words, it means 
that if a system is in the state with a certain position, all three components of the 
position vector are well-defined. The same is true for the vector of momentum as 
expressed by the second of the commutators in Eq. 3.46. These commutators reflect 
our desire, born out of empirical experience, for the position and momentum of the 
quantum systems to be genuinely well-defined quantities, at least when measured 
independently of each other. 

The commutators presented in Eq. 3.47 are often called canonical commutation 
relations, and they also express our empirical experience, namely, the fact that the 
same Cartesian components of position and momentum vectors of a quantum system 
are not mutually consistent observables and cannot, therefore, be described by 
commuting operators. The actual form of the commutator is chosen to reproduce 
Heisenberg’s uncertainty principle, which is discussed in the next section. You 
will also see later that the empirical foundation for this form of the commutator 
can be traced to the de Broglie relation, Eq. 1.3. It is interesting to note a striking 
similarity between commutators given in Eqs. 3.46 and 3.47 and canonical Poisson 
brackets of classical mechanics, Eq. 3.5. This similarity lies in the foundation 
of the so-called canonical quantization rule: any classical conjugated quantities 
satisfying Eq. 3.5 in quantum theory are promoted to quantum operators obeying the 
canonical commutation relation 3.47. Therefore, canonically conjugated variables 
never belong to the same class of mutually consistent observables and are found on 
the opposite sides of the Bohr complementarity principle. 
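
To see the canonical commutation relation at work in one dimension, one can let the operators act on polynomials. The sketch below rests on assumptions that this section has not introduced: it takes the standard coordinate representation, in which the position operator multiplies a function by x and the momentum operator acts as -i*hbar*d/dx, and it sets hbar to 1. The functions `x_op`, `p_op`, and `commutator_xp` are hypothetical helpers; a polynomial is stored as a list of coefficients, lowest power first.

```python
HBAR = 1.0   # natural units, an assumption made for this illustration

def x_op(coeffs):
    """Multiply the polynomial by x: every coefficient moves up one power."""
    return [0j] + list(coeffs)

def p_op(coeffs):
    """Apply -i*hbar*d/dx to the polynomial."""
    deriv = [k * c for k, c in enumerate(coeffs)][1:]
    return [-1j * HBAR * c for c in deriv] or [0j]

def commutator_xp(coeffs):
    """Return the coefficients of (x p - p x) applied to the polynomial."""
    a = x_op(p_op(coeffs))
    b = p_op(x_op(coeffs))
    n = max(len(a), len(b))
    a = a + [0j] * (n - len(a))
    b = b + [0j] * (n - len(b))
    return [u - v for u, v in zip(a, b)]

f = [1, 2, 3]                         # f(x) = 1 + 2x + 3x^2
result = commutator_xp(f)
expected = [1j * HBAR * c for c in f]   # i*hbar*f(x)
print(result, expected)
```

The commutator returns the original polynomial multiplied by i*hbar, whatever coefficients are chosen, which is the one-dimensional content of Eq. 3.47 in this representation.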


3.3.3 Constructing the Observables: A Few Important 
Examples 

Using coordinate and momentum operators, I can construct operators for other 
observables, which is done according to the standard quantization rule. 

Quantization Rule To turn a classical observable into an operator, replace 
all coordinates and momenta appearing in its classical definition with the 
corresponding operators, respecting the requirements of hermiticity and the order of 
multiplication, when necessary. 

In many situations, the issues related to hermiticity or to the multiplication order 
of observables are resolved automatically, but in some cases one needs to pay special 
attention to them. To get you started, consider several of the simplest examples. 

Kinetic Energy 

Kinetic energy of a single particle with mass $m_e$ is described by operator 

$$\hat{K} = \frac{\hat{\mathbf{p}}^2}{2m_e},$$

which is obtained from the corresponding classical expression by replacing classical 
momentum with the momentum operator. The eigenvectors of this operator coincide 
with the eigenvectors of the momentum operator, and its eigenvalues, which form 
a continuous spectrum, provide values of kinetic energy that can be observed for a 
system under study. 

Potential Energy 

Potential energy is obtained from the respective classical potential energy 
function by replacing the classical coordinate argument of the function with its operator 
equivalent: $U(\mathbf{r}) \to U(\hat{\mathbf{r}})$. It is assumed here, of course, that the potential energy 
function can be presented as a series of positive and negative powers of $\mathbf{r}$, in 
which case the corresponding operator expression would have an easily identifiable 
meaning. Examples of such transformations are one-dimensional harmonic potential 
($kx^2 \to k\hat{x}^2$) and Coulomb potential ($k/r \to k\hat{r}^{-1}$), where $r$ is the absolute value 
of the position vector.⁴ The eigenvectors of this operator are the same as of the 
position operator, and the respective eigenvalues determine the possible values of 
the potential energy of the system. 

Hamiltonian 

The Hamiltonian, which in classical mechanics is defined as the energy of the system
expressed in terms of canonically conjugated coordinate and momentum, in quantum
mechanics becomes, in the single-particle case, an operator of the form

$$\hat{H} = \frac{\hat{p}^2}{2m_e} + U(\hat{\mathbf{r}}). \tag{3.48}$$

Since position and momentum operators do not commute, the eigenvectors of
the Hamiltonian are usually different from the eigenvectors of both position
and momentum operators. Eigenvalues of the Hamiltonian can belong to a discrete,
continuous, or mixed spectrum and determine the values of energy that the system
can have in the given environment. This is the most important operator in all of
quantum physics: just like the classical Hamiltonian, its quantum counterpart controls
the dynamics of quantum objects.

Angular Momentum 

Angular momentum is a very special kind of observable. Classical angular
momentum is a vector defined as the cross product of the position and momentum
vectors, $\mathbf{L} = \mathbf{r} \times \mathbf{p}$. The quantization rule requires that the quantum mechanical
angular momentum operator is constructed by promoting the position and momentum
vectors to the corresponding operators:

$$\hat{\mathbf{L}} = \hat{\mathbf{r}} \times \hat{\mathbf{p}}. \tag{3.49}$$

However, since this expression involves the product of potentially non-commuting
operators, one has to be careful with the order of the multiplication.
One also needs to make sure that the resulting operator is Hermitian. To address
both these concerns, I will expand the angular momentum vector in its Cartesian
components:

$$\hat{L}_x = \hat{y}\hat{p}_z - \hat{z}\hat{p}_y, \tag{3.50}$$

$$\hat{L}_y = \hat{z}\hat{p}_x - \hat{x}\hat{p}_z, \tag{3.51}$$

$$\hat{L}_z = \hat{x}\hat{p}_y - \hat{y}\hat{p}_x. \tag{3.52}$$

⁴ This transformation is not as trivial as it might seem since taking the absolute value of a vector
involves the operation of square root, which is not well defined for operators. Practically it is not
a problem, however, because usually one works in the basis of the eigenvectors of the position
operator, in which case $\hat{r}^{-1}$ becomes simply $1/r$. If you are not concerned with any of this, this
note is not for you. I mention it here simply in order to avoid accusations of sweeping something
under the rug.


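(As a useful mnemonic device, one can use the representation of the vector product as a
determinant:

$$\mathbf{r} \times \mathbf{p} = \begin{vmatrix} \mathbf{e}_x & \mathbf{e}_y & \mathbf{e}_z \\ x & y & z \\ p_x & p_y & p_z \end{vmatrix},$$

where the first line is formed by unit vectors defining the corresponding axes of a
Cartesian coordinate system.)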

The first thing to notice in Eqs. 3.50-3.52 is that the operators that are actually being
multiplied correspond to commuting components of the position and momentum
vectors; thus, the order in which you place these operators is not important. Next,
you need to verify that each of the components of the angular momentum operator
is a Hermitian operator. Hermitian conjugation of, e.g., the x-component yields

$$\hat{L}_x^\dagger = (\hat{y}\hat{p}_z)^\dagger - (\hat{z}\hat{p}_y)^\dagger = \hat{p}_z\hat{y} - \hat{p}_y\hat{z} = \hat{L}_x,$$


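proving hermiticity of this operator. Similarly, you can demonstrate the Hermitian
nature of the two other components. The most unusual property of the angular
momentum, however, is that different components of the angular momentum do
not commute. To illustrate this point, compute the commutator $[\hat{L}_x, \hat{L}_y]$:

$$\begin{aligned}
[\hat{L}_x, \hat{L}_y] &= (\hat{y}\hat{p}_z - \hat{z}\hat{p}_y)(\hat{z}\hat{p}_x - \hat{x}\hat{p}_z) - (\hat{z}\hat{p}_x - \hat{x}\hat{p}_z)(\hat{y}\hat{p}_z - \hat{z}\hat{p}_y) \\
&= \hat{y}\hat{p}_z\hat{z}\hat{p}_x - \hat{y}\hat{p}_z\hat{x}\hat{p}_z - \hat{z}\hat{p}_y\hat{z}\hat{p}_x + \hat{z}\hat{p}_y\hat{x}\hat{p}_z - \hat{z}\hat{p}_x\hat{y}\hat{p}_z + \hat{z}\hat{p}_x\hat{z}\hat{p}_y + \hat{x}\hat{p}_z\hat{y}\hat{p}_z - \hat{x}\hat{p}_z\hat{z}\hat{p}_y \\
&= \hat{y}\hat{p}_x(\hat{p}_z\hat{z} - \hat{z}\hat{p}_z) + \hat{p}_y\hat{x}(\hat{z}\hat{p}_z - \hat{p}_z\hat{z}) = i\hbar(\hat{p}_y\hat{x} - \hat{y}\hat{p}_x) = i\hbar\hat{L}_z,
\end{aligned} \tag{3.53}$$

where, when transitioning from the second line to the third, I took into account that
different components of the coordinate and momentum operators do commute, so
that their order can be changed at will. Similarly, you will find (do it!)

$$[\hat{L}_y, \hat{L}_z] = i\hbar\hat{L}_x, \tag{3.54}$$

$$[\hat{L}_z, \hat{L}_x] = i\hbar\hat{L}_y. \tag{3.55}$$
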
These results indicate that the vector of the angular momentum in quantum theory
is quite different from regular classical vectors as well as from the vector operators
of position and momentum: different components of this vector do not belong to
the same group of mutually commuting operators and do not represent mutually
consistent observables, meaning that this vector is not really well-defined. More
specifically, if a quantum system is in a state in which one of the Cartesian
components of the angular momentum is known with certainty, measurements of
the two other components will produce statistically uncertain results. This conclusion,
in addition to making the direction of the angular momentum vector uncertain, also
raises a question about its magnitude. Indeed, the magnitude of a generic classical
3-D vector is defined as $|\mathbf{A}| = \sqrt{A_x^2 + A_y^2 + A_z^2}$. Formal quantization of this
expression is not possible because the square root of an operator, $\sqrt{\hat{A}_x^2 + \hat{A}_y^2 + \hat{A}_z^2}$,
is not a well-defined object. In the case of position and momentum operators,
this problem did not arise because different components of these operators
commute, so that one can always choose a coordinate system in which all but one
component of the position or momentum operators are equal to zero. The possible
values of the remaining non-zero component will define the magnitude of the entire
vector. This approach is not possible in the case of angular momentum because of
the incompatibility of its components. This problem is circumvented by choosing
the operator of the square of angular momentum defined as


$$\hat{L}^2 = \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2 \tag{3.56}$$

to represent its magnitude. Computing the commutators $[\hat{L}^2, \hat{L}_i]$, $i = x, y, z$, you will find
that all three commutators vanish. (The proof of this statement is left to you as an exercise.)
This means that operators of the square of the angular momentum and one (any)
component of the angular momentum are compatible observables, so that a quantum
system can be created in a state in which one of the components and the magnitude
of the angular momentum are known with certainty. Obviously such a state would
be a common eigenvector of $\hat{L}^2$ and $\hat{L}_z$.
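
As an illustration of how such a proof can go, here is a sketch for the z-component only, so that the other two remain an exercise. It uses the commutators $[\hat{L}_x, \hat{L}_z] = -i\hbar\hat{L}_y$ and $[\hat{L}_y, \hat{L}_z] = i\hbar\hat{L}_x$ together with the general operator identity $[\hat{A}\hat{B}, \hat{C}] = \hat{A}[\hat{B}, \hat{C}] + [\hat{A}, \hat{C}]\hat{B}$, which is assumed here without derivation:

```latex
\begin{align*}
[\hat{L}_x^2, \hat{L}_z] &= \hat{L}_x[\hat{L}_x, \hat{L}_z] + [\hat{L}_x, \hat{L}_z]\hat{L}_x
  = -i\hbar\,(\hat{L}_x\hat{L}_y + \hat{L}_y\hat{L}_x), \\
[\hat{L}_y^2, \hat{L}_z] &= \hat{L}_y[\hat{L}_y, \hat{L}_z] + [\hat{L}_y, \hat{L}_z]\hat{L}_y
  = i\hbar\,(\hat{L}_y\hat{L}_x + \hat{L}_x\hat{L}_y), \\
[\hat{L}_z^2, \hat{L}_z] &= 0, \\
[\hat{L}^2, \hat{L}_z] &= [\hat{L}_x^2, \hat{L}_z] + [\hat{L}_y^2, \hat{L}_z] + [\hat{L}_z^2, \hat{L}_z] = 0.
\end{align*}
```

The first two contributions are equal in magnitude and opposite in sign, which is why the sum vanishes even though neither of them does separately.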


Quantization of p · r

As a last example, consider a classical expression of the form $\mathbf{p} \cdot \mathbf{r}$, which appears in
some applications. An attempt to directly transform this expression into the quantum
form by promoting the momentum and position vectors to operators faces two
obstacles. First, the operators in this expression do not commute, and so it is
unclear what the correct order of multiplication is. Second, even if I arbitrarily
impose a particular order, say, $\hat{\mathbf{p}} \cdot \hat{\mathbf{r}}$, the resulting operator is not Hermitian because
$(\hat{\mathbf{p}} \cdot \hat{\mathbf{r}})^\dagger = \hat{\mathbf{r}} \cdot \hat{\mathbf{p}} \neq \hat{\mathbf{p}} \cdot \hat{\mathbf{r}}$. To carry out the quantization procedure in this case, you need
to come up with an expression that coincides with the original classical
version, does not depend on the order of the operators, and is Hermitian. One
way to achieve this is to introduce the operator

$$\frac{1}{2}\left(\hat{\mathbf{p}} \cdot \hat{\mathbf{r}} + \hat{\mathbf{r}} \cdot \hat{\mathbf{p}}\right),$$

which satisfies all these conditions. However, this quantization procedure is not
unique, and it might (and does) create problems down the road, but luckily for us
this is not the road I choose for us to travel.
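
To see concretely why the symmetrized combination works, one can write the scalar products in Cartesian components and use the canonical commutation relation 3.47 in the form $[\hat{x}_j, \hat{p}_j] = i\hbar$ for each of the three components. The component label $j$ is introduced only for this sketch:

```latex
\begin{align*}
(\hat{\mathbf{p}}\cdot\hat{\mathbf{r}})^\dagger
  &= \sum_j (\hat{p}_j\hat{x}_j)^\dagger = \sum_j \hat{x}_j\hat{p}_j
   = \hat{\mathbf{r}}\cdot\hat{\mathbf{p}}, \\
\hat{\mathbf{p}}\cdot\hat{\mathbf{r}} - \hat{\mathbf{r}}\cdot\hat{\mathbf{p}}
  &= \sum_j [\hat{p}_j, \hat{x}_j] = -3i\hbar, \\
\frac{1}{2}\left(\hat{\mathbf{p}}\cdot\hat{\mathbf{r}} + \hat{\mathbf{r}}\cdot\hat{\mathbf{p}}\right)
  &= \hat{\mathbf{r}}\cdot\hat{\mathbf{p}} - \frac{3i\hbar}{2}.
\end{align*}
```

The first line shows that Hermitian conjugation merely swaps the two terms of the symmetrized sum, so the sum is Hermitian; the last line shows that it differs from either ordering only by a constant.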


3.3.4 Eigenvalues of the Angular Momentum 


The operators of the angular momentum play an extraordinary role in quantum 
theory, both on the fundamental level and for applications. The fundamental role 
of the angular momentum is derived from its relation to the rotation operator and 
rotational symmetry of quantum systems, but discussion of this topic is well above 
your pay grade. Those interested in the topic are free to consult any graduate level 
quantum mechanics text. From the point of view of applications, the importance 
of the angular momentum stems from the fact that many fundamental interactions 
in nature are described by so-called central potentials. The potential energy of 
such interactions depends only on the absolute value of the distance between two 
interacting particles, but not on the orientation of the vector of their relative position. 
This text is mostly concerned with quantum mechanics of a single particle in an 
external potential (a two-particle problem can often be presented in this form as 
well). If the external potential belongs to the class of central potentials, it can be 
shown that the Hamiltonian of such a system commutes with all components of the
angular momentum as well as with operator $\hat{L}^2$. The proof of this statement requires
proving it separately for the kinetic energy operator (essentially for operator $\hat{p}^2$) and for
the potential energy operator $\hat{V}(r)$. I believe that the readers of this text are already
equipped to prove that $[\hat{L}_{x,y,z}, \hat{p}^2] = [\hat{L}^2, \hat{p}^2] = 0$, so I leave it to you as an exercise.
As far as the commutators with the potential energy operator go, this proof will have
to be left till later.

Vanishing of the commutators of angular momentum operators and the Hamiltonian
means that the Hamiltonian, $\hat{L}^2$, and one of the components of the angular
momentum form a system of commuting operators, so that the common eigenvectors of $\hat{L}^2$
and, say, $\hat{L}_z$ can be chosen to be eigenvectors of the Hamiltonian as well. This fact can
significantly simplify finding eigenvalues and eigenvectors of the Hamiltonian.

It is also remarkable that the eigenvalues of $\hat{L}^2$ and, for instance, $\hat{L}_z$ can be found
using only the commutation relations given by Eqs. 3.53-3.55. The choice of the
z-component here is a historical accident and does not have any physical
significance. By choosing this particular component, which, you should understand,
is attached to a particular choice of the coordinate system, we essentially say
to the experimentalists that if the quantum system is in a state described by the
eigenvectors of $\hat{L}_z$ as defined by this coordinate system, then a measurement of the
component of the angular momentum in the same direction will produce results
corresponding to the respective eigenvalue with certainty, while measurements of
any other component of the angular momentum will have quantum uncertainty.
I begin the search for the eigenvalues by introducing abstract vectors $|\lambda_L, \lambda_z\rangle$
defined as common eigenvectors of operators $\hat{L}^2$ and $\hat{L}_z$ characterized by some yet
unknown eigenvalues $\lambda_L$ and $\lambda_z$:

$$\hat{L}^2|\lambda_L, \lambda_z\rangle = \lambda_L|\lambda_L, \lambda_z\rangle, \tag{3.57}$$

$$\hat{L}_z|\lambda_L, \lambda_z\rangle = \lambda_z|\lambda_L, \lambda_z\rangle. \tag{3.58}$$

It is convenient to present these eigenvalues as $\lambda_L = \hbar^2 p$ and $\lambda_z = \hbar m$. Pulling out
factors $\hbar^2$ and $\hbar$ from the eigenvalues of $\hat{L}^2$ and $\hat{L}_z$, respectively, makes the remaining
quantities $p$ and $m$ dimensionless since the dimension of the angular momentum is
the same as that of Planck's constant. Evidently, I will need to invoke, somehow, the
two remaining components of the angular momentum. It is not immediately obvious
how to do it, but let's say that I have had a divine intervention or premonition that
the following two new operators might be useful:

$$\hat{L}_+ = \hat{L}_x + i\hat{L}_y, \tag{3.59}$$

$$\hat{L}_- = \hat{L}_x - i\hat{L}_y. \tag{3.60}$$


The first thing I need to do with these operators is to compute their commutators
with operators $\hat{L}^2$ and $\hat{L}_z$:

$$[\hat{L}_+, \hat{L}_z] = [\hat{L}_x, \hat{L}_z] + i[\hat{L}_y, \hat{L}_z] = -i\hbar\hat{L}_y - \hbar\hat{L}_x = -\hbar\hat{L}_+, \tag{3.61}$$

$$[\hat{L}_-, \hat{L}_z] = [\hat{L}_x, \hat{L}_z] - i[\hat{L}_y, \hat{L}_z] = -i\hbar\hat{L}_y + \hbar\hat{L}_x = \hbar\hat{L}_-. \tag{3.62}$$

It is also easy to see that the commutators $[\hat{L}^2, \hat{L}_\pm]$ vanish. Indeed, $\hat{L}^2$ commutes with
all component operators and, therefore, with $\hat{L}_\pm$, which are combinations of $\hat{L}_x$ and
$\hat{L}_y$. Now, the new operators for a theoretician are like new toys for a child, and I am
eager to play with them and see what they can do. So, to satisfy the urge, and in
hopes to learn something new, I want to apply operators $\hat{L}_\pm$ to Eq. 3.57:

$$\hat{L}_\pm\hat{L}^2|\lambda_L, \lambda_z\rangle = \hat{L}^2\hat{L}_\pm|\lambda_L, \lambda_z\rangle = \hbar^2 p\,\hat{L}_\pm|\lambda_L, \lambda_z\rangle, \tag{3.63}$$

where I used $\hat{L}^2\hat{L}_\pm = \hat{L}_\pm\hat{L}^2$. OK, and what did we learn from this exercise? Well, I
know now that if $|\lambda_L, \lambda_z\rangle$ is an eigenvector of $\hat{L}^2$ with eigenvalue $\hbar^2 p$, then the vector
$\hat{L}_\pm|\lambda_L, \lambda_z\rangle$ is still an eigenvector of $\hat{L}^2$ with the same eigenvalue, which is not really
surprising because $\hat{L}_\pm$ do commute with $\hat{L}^2$. So far, it is not much, and you would
be right to say that so far operators $\hat{L}_\pm$ have not given us any particular advantages
because we would have gotten the same result with operators $\hat{L}_{x,y}$. But let's not jump
the gun (always a bad idea); patience and persistence are virtues. Instead, let
me play another game and apply $\hat{L}_+$ to Eq. 3.58:

$$\begin{aligned}
\hat{L}_+\hat{L}_z|\lambda_L, \lambda_z\rangle &= \hbar m\,\hat{L}_+|\lambda_L, \lambda_z\rangle, \\
\left(\hat{L}_z\hat{L}_+ - \hbar\hat{L}_+\right)|\lambda_L, \lambda_z\rangle &= \hbar m\,\hat{L}_+|\lambda_L, \lambda_z\rangle, \\
\hat{L}_z\hat{L}_+|\lambda_L, \lambda_z\rangle &= \hbar(m + 1)\,\hat{L}_+|\lambda_L, \lambda_z\rangle,
\end{aligned} \tag{3.64}$$

where I used commutation relation 3.61 to make the transition from the first to the
second line. Now, the last line in Eq. 3.64 tells us that $\hat{L}_+|\lambda_L, \lambda_z\rangle$ is an eigenvector
of $\hat{L}_z$ with eigenvalue $\hbar m + \hbar$. This is quite an exciting result: it means that if I start
with some eigenvector with a known eigenvalue, I can generate new eigenvectors
with progressively increasing eigenvalues: $\hbar m + \hbar$, $\hbar m + 2\hbar$, $\hbar m + 3\hbar$, ... . This is
already something new, which we could not have gotten without the operator $\hat{L}_+$.
The secret of this operator lies in its commutator with $\hat{L}_z$, which is proportional to
$\hat{L}_+$ itself. The same is true for operator $\hat{L}_-$, so it is worth looking into what this
operator can do:

$$\begin{aligned}
\hat{L}_-\hat{L}_z|\lambda_L, \lambda_z\rangle &= \hbar m\,\hat{L}_-|\lambda_L, \lambda_z\rangle, \\
\left(\hat{L}_z\hat{L}_- + \hbar\hat{L}_-\right)|\lambda_L, \lambda_z\rangle &= \hbar m\,\hat{L}_-|\lambda_L, \lambda_z\rangle, \\
\hat{L}_z\hat{L}_-|\lambda_L, \lambda_z\rangle &= \hbar(m - 1)\,\hat{L}_-|\lambda_L, \lambda_z\rangle.
\end{aligned} \tag{3.65}$$

When deriving Eq. 3.65, I again applied the commutator from Eq. 3.62 to its first line.
The final result of this calculation indicates that operator $\hat{L}_-$ also generates new
eigenvectors of $\hat{L}_z$ but with progressively decreasing eigenvalues. Not surprisingly,
operators $\hat{L}_+$ and $\hat{L}_-$ are called raising and lowering ladder operators.

Now, the question arises: will this process of generating new eigenvectors and
eigenvalues ever stop? In other words, can operator $\hat{L}_z$ have arbitrarily large and
arbitrarily small eigenvalues? Intuitively, it is clear that the answer to this question
must be negative and that the possible eigenvalues of $\hat{L}_z$ must be limited both
from above and from below. Indeed, these eigenvalues represent possible results
of the measurement of one component of a vector, while eigenvalues of $\hat{L}^2$ represent
possible experimentally observable values of the squared magnitude of the same
vector. It is difficult to imagine that the component of a vector can be larger than
the magnitude of the same vector, and therefore one should expect that there must
be some kind of a relation between these two eigenvalues, e.g., something like
$m^2 \le p$. In order to see if such a relation indeed exists, consider the following
expression:

$$\langle\lambda_L, \lambda_z|\hat{L}^2|\lambda_L, \lambda_z\rangle = \langle\lambda_L, \lambda_z|\hat{L}_x^2|\lambda_L, \lambda_z\rangle + \langle\lambda_L, \lambda_z|\hat{L}_y^2|\lambda_L, \lambda_z\rangle + \langle\lambda_L, \lambda_z|\hat{L}_z^2|\lambda_L, \lambda_z\rangle.$$

Taking into account Eqs. 3.57 and 3.58, this can be written as

$$\hbar^2 p = \langle\lambda_L, \lambda_z|\hat{L}_x^2|\lambda_L, \lambda_z\rangle + \langle\lambda_L, \lambda_z|\hat{L}_y^2|\lambda_L, \lambda_z\rangle + \hbar^2 m^2. \tag{3.66}$$
Since the expectation values of operators $\hat{L}_x^2$ and $\hat{L}_y^2$ in any state are non-negative
quantities, Eq. 3.66 yields that $p \ge m^2$. This means that there exists the smallest $m$,
which I will designate as $\bar{l}$, and there exists the largest $m$, for which I will use symbol
$l$. Now, assume that you are dealing with the eigenvector $|\lambda_L, \hbar\bar{l}\rangle$ and applying
operator $\hat{L}_-$ to it. Generally speaking, this operator must lower the eigenvalue, but
we assumed that this eigenvalue is already the lowest. The only way to reconcile
Eq. 3.65 with this assumption is to require that

$$\hat{L}_-|\lambda_L, \hbar\bar{l}\rangle = 0. \tag{3.67}$$

In order to figure out how to use this important piece of information, I again need a
bit of divine inspiration, or I can just notice that the product of operators $\hat{L}_+\hat{L}_-$ can
be expressed in terms of operators $\hat{L}^2$ and $\hat{L}_z$:

$$\hat{L}_+\hat{L}_- = \hat{L}_x^2 + \hat{L}_y^2 + i\hat{L}_y\hat{L}_x - i\hat{L}_x\hat{L}_y = \hat{L}^2 - \hat{L}_z^2 + \hbar\hat{L}_z.$$

Rewriting this expression as

$$\hat{L}^2 = \hat{L}_z^2 - \hbar\hat{L}_z + \hat{L}_+\hat{L}_-, \tag{3.68}$$

and applying it to vector $|\lambda_L, \hbar\bar{l}\rangle$ while taking into account Eq. 3.67, I obtain

$$\begin{aligned}
\hat{L}^2|\lambda_L, \hbar\bar{l}\rangle &= \hat{L}_z^2|\lambda_L, \hbar\bar{l}\rangle - \hbar\hat{L}_z|\lambda_L, \hbar\bar{l}\rangle + \hat{L}_+\hat{L}_-|\lambda_L, \hbar\bar{l}\rangle \Rightarrow \\
\hbar^2 p\,|\lambda_L, \hbar\bar{l}\rangle &= \hbar^2\bar{l}^2|\lambda_L, \hbar\bar{l}\rangle - \hbar^2\bar{l}\,|\lambda_L, \hbar\bar{l}\rangle \Rightarrow \\
p &= \bar{l}^2 - \bar{l}.
\end{aligned} \tag{3.69}$$

Now, consider the state characterized by the largest value of $m$, $|\lambda_L, \hbar l\rangle$. Attempting
to act on this vector with operator $\hat{L}_+$ leaves you with the same conundrum
encountered when discussing vector $|\lambda_L, \hbar\bar{l}\rangle$, but by now you know the way out:
you must require that

$$\hat{L}_+|\lambda_L, \hbar l\rangle = 0. \tag{3.70}$$

The derivation of Eq. 3.69 based on Eq. 3.67 was successful because the lowering
operator $\hat{L}_-$ appears in Eq. 3.68 after operator $\hat{L}_+$. Consequently, when the
product $\hat{L}_+\hat{L}_-$ is made to act on $|\lambda_L, \hbar\bar{l}\rangle$, the resulting expression vanishes. In order
to achieve the same effect with state $|\lambda_L, \hbar l\rangle$ and Eq. 3.70, I need to modify Eq. 3.68
in such a way that it would contain the combination $\hat{L}_-\hat{L}_+$ instead of $\hat{L}_+\hat{L}_-$. To achieve
this, consider

$$\hat{L}_-\hat{L}_+ = \hat{L}_x^2 + \hat{L}_y^2 - i\hat{L}_y\hat{L}_x + i\hat{L}_x\hat{L}_y = \hat{L}^2 - \hat{L}_z^2 - \hbar\hat{L}_z, \tag{3.71}$$

which can be rewritten in the desired form

$$\hat{L}^2 = \hat{L}_z^2 + \hbar\hat{L}_z + \hat{L}_-\hat{L}_+. \tag{3.72}$$

Now applying $\hat{L}^2$ to $|\lambda_L, \hbar l\rangle$ and using Eqs. 3.72 and 3.70, I get

$$p = l^2 + l. \tag{3.73}$$

Comparing Eq. 3.69 with Eq. 3.73, I infer that the smallest and largest eigenvalues of $\hat{L}_z$
are related to each other as

$$l^2 + l = \bar{l}^2 - \bar{l}.$$

It is easy to see (one can always just solve the quadratic equation for $\bar{l}$) that this
relation implies that $\bar{l} = -l$ or $\bar{l} = l + 1$. The latter solution contradicts
the assumption that $\bar{l}$ is the smallest eigenvalue and $l$ is the largest; thus the only
possibility which makes sense is $\bar{l} = -l$.

Now imagine that you have found the smallest eigenvalue $-l$ and you start
applying operator $\hat{L}_+$ to state $|\lambda_L, -\hbar l\rangle$. After each application of the operator, the
eigenvalue of $\hat{L}_z$ (in units of $\hbar$) increases by one, so that after applying it $N$ times, you end up with
eigenvalue $-l + N$. Eventually you must reach the largest eigenvalue $l$, at which
point you will have $-l + N = l \Rightarrow 2l = N$. $N$ is evidently an integer number, so $l$
can be either integer, if $N$ is even, or half-integer, if $N$ is odd.

Now, let us gather our thoughts and try to summarize what it is that we have
got:

1. The eigenvalue of operator $\hat{L}^2$ is equal to $\hbar^2 l(l + 1)$, where $l$ determines the
maximum eigenvalue of the operator $\hat{L}_z$, $\hbar l$.

2. $l$ can take either integer or half-integer values, forming two non-overlapping
series of allowed values: $0, 1, 2, 3, \ldots$ or $1/2, 3/2, 5/2, \ldots$

3. Allowed values of $m$ start at $-l$ and increase in steps of one until they reach $l$.
For instance, for $l = 0$, the only possible value of $m$ is zero; for $l = 1/2$, $m$ can
be $-1/2, 1/2$; and for $l = 1$, we can have states with $m = -1, 0, 1$. In general,
for a state characterized by the same eigenvalue of operator $\hat{L}^2$, $\hbar^2 l(l + 1)$, there
are $2l + 1$ possible states with different eigenvalues of $\hat{L}_z$.
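
The three rules summarized above are easy to tabulate. The following minimal sketch lists the allowed values of m and the eigenvalue of L^2 (in units of ħ^2) for a given l; the function names `allowed_m` and `l2_eigenvalue` are hypothetical helpers introduced only for this illustration, and exact fractions are used so that half-integer values are not distorted by rounding.

```python
from fractions import Fraction


def allowed_m(l):
    """Return the allowed m values, from -l to +l in steps of one."""
    l = Fraction(l)
    # 2l must be a non-negative integer: l = 0, 1/2, 1, 3/2, ...
    if l < 0 or (2 * l).denominator != 1:
        raise ValueError("l must be a non-negative integer or half-integer")
    return [-l + k for k in range(int(2 * l) + 1)]


def l2_eigenvalue(l):
    """Eigenvalue of L^2 in units of hbar^2, i.e. l(l + 1)."""
    l = Fraction(l)
    return l * (l + 1)


for l in (0, Fraction(1, 2), 1, Fraction(3, 2)):
    ms = allowed_m(l)
    print(f"l = {l}: m = {[str(m) for m in ms]}, "
          f"{len(ms)} states, L^2 eigenvalue = {l2_eigenvalue(l)} hbar^2")
```

For every l the length of the returned list equals 2l + 1, which is the number of states quoted in item 3.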

It is interesting to note that if I were talking about a classical vector, the maximum
magnitude of its component along an axis would simply equal the length of the
vector. If we interpret expression $\hbar l$ as such a component's length, then the squared
length of the entire vector would have been $\hbar^2 l^2$, which is different from the quantum
result $\hbar^2 l^2 + \hbar^2 l$. One can see that the "extra" contribution to the "length" comes
from fluctuations of the two other components of the angular momentum. Indeed, using
what you have learned from Eq. 3.66, you can write

$$\begin{aligned}
\hbar^2 l^2 + \hbar^2 l &= \hbar^2 l^2 + \langle l, l|\hat{L}_x^2|l, l\rangle + \langle l, l|\hat{L}_y^2|l, l\rangle \Rightarrow \\
\langle l, l|\hat{L}_x^2|l, l\rangle + \langle l, l|\hat{L}_y^2|l, l\rangle &= \hbar^2 l \Rightarrow \\
\langle l, l|\hat{L}_x^2|l, l\rangle = \langle l, l|\hat{L}_y^2|l, l\rangle &= \hbar^2 l/2.
\end{aligned}$$

In the last expression, I introduced a shortcut notation for the common eigenvectors
of operators $\hat{L}^2$ and $\hat{L}_z$, which in general looks like $|l, m\rangle$ with the first number
indicating that this vector belongs to the eigenvalue $\hbar^2 l(l + 1)$ of $\hat{L}^2$ and the second
number pointing at the eigenvalue $\hbar m$ of $\hat{L}_z$. For brevity, $l$ is often referred to as
the "angular momentum," and $m$ is often called a "magnetic" quantum number. The
origin of this name will become clear later, when we get to consider the behavior of
atoms in the magnetic field.

Finally, let me note that even though we know now that ladder operators $\hat{L}_\pm$
generate eigenvectors of $\hat{L}_z$, there is no guarantee that the resulting eigenvectors
will be normalized even if the initial vector is. So, in order to finalize the rule for
obtaining normalized eigenvectors using ladder operators, we have to analyze their
action more carefully. First, it is easy to see that they are Hermitian conjugates of
each other:

$$\hat{L}_- = \hat{L}_+^\dagger. \tag{3.74}$$

Assuming that vectors $|l, m\rangle$ and $|l, m + 1\rangle$ are normalized and introducing a yet
unknown normalization coefficient, I can write

$$\hat{L}_+|l, m\rangle = A_{l,m}|l, m + 1\rangle.$$

The Hermitian conjugation of this expression yields

$$\langle l, m|\hat{L}_- = \langle l, m + 1|A_{l,m}^*.$$

Multiplying the left-hand side of this equation by the left-hand side of the previous
one and doing the same to their right-hand sides yields

$$\langle l, m|\hat{L}_-\hat{L}_+|l, m\rangle = |A_{l,m}|^2\langle l, m + 1|l, m + 1\rangle.$$

Since it was assumed that all ket vectors are normalized, I now immediately have
for $|A_{l,m}|^2$:

$$|A_{l,m}|^2 = \langle l, m|\hat{L}_-\hat{L}_+|l, m\rangle.$$

Taking into account Eq. 3.71, and the fact that the kets in this expression are eigenvectors
of $\hat{L}^2$ and $\hat{L}_z$, I find

$$|A_{l,m}|^2 = \hbar^2\left[l(l + 1) - m(m + 1)\right],$$

which allows me to establish the final rule for the generation of new eigenvectors from
the known ones:

$$\hat{L}_+|l, m\rangle = \hbar\sqrt{l(l + 1) - m(m + 1)}\,|l, m + 1\rangle. \tag{3.75}$$

I will leave it to you to show that

$$\hat{L}_-|l, m\rangle = \hbar\sqrt{l(l + 1) - m(m - 1)}\,|l, m - 1\rangle. \tag{3.76}$$

To conclude this section, let me just emphasize once again that we were able to
find eigenvalues for the system of operators, as well as a rule for generating their
eigenvectors, using nothing but their commutation relations. The key to successful
completion of this task was the existence of the ladder operators with their very
special commutation relations given by Eqs. 3.61 and 3.62.
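
The ladder-operator rules can be turned into explicit matrices and checked numerically. The sketch below is an illustration under stated assumptions, not part of the derivation: it sets ħ = 1, orders the basis as m = l, l-1, ..., -l, and uses helper names (`angular_momentum`, `check`, and so on) that are introduced here only for this example. It builds the matrix of L+ from Eq. 3.75, obtains L- as its Hermitian conjugate (Eq. 3.74), and then measures how far the resulting matrices are from satisfying [Lx, Ly] = iLz, L^2 = l(l+1), and <l,l|Lx^2|l,l> = l/2.

```python
import math


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lincomb(ca, a, cb, b):
    """Entry-wise linear combination ca*a + cb*b."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]


def max_abs_diff(a, b):
    """Largest absolute value among the entries of a - b."""
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))


def angular_momentum(two_l):
    """Return (l, Lx, Ly, Lz) with hbar = 1; two_l is the integer 2l."""
    l = two_l / 2
    ms = [l - k for k in range(two_l + 1)]      # m = l, l-1, ..., -l
    n = len(ms)
    lp = [[0.0] * n for _ in range(n)]
    for col in range(1, n):
        m = ms[col]
        # Eq. 3.75: L+|l,m> = sqrt(l(l+1) - m(m+1)) |l,m+1>
        lp[col - 1][col] = math.sqrt(l * (l + 1) - m * (m + 1))
    # Eq. 3.74: L- is the Hermitian conjugate of L+ (entries are real here)
    lm = [[lp[j][i] for j in range(n)] for i in range(n)]
    lx = lincomb(0.5, lp, 0.5, lm)              # Lx = (L+ + L-)/2
    ly = lincomb(-0.5j, lp, 0.5j, lm)           # Ly = (L+ - L-)/(2i)
    lz = [[ms[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    return l, lx, ly, lz


def check(two_l):
    """Return three residuals that should vanish up to rounding errors."""
    l, lx, ly, lz = angular_momentum(two_l)
    n = len(lz)
    comm = lincomb(1, matmul(lx, ly), -1, matmul(ly, lx))
    i_lz = [[1j * lz[i][j] for j in range(n)] for i in range(n)]
    lx2 = matmul(lx, lx)
    l2 = lincomb(1, lincomb(1, lx2, 1, matmul(ly, ly)), 1, matmul(lz, lz))
    target = [[l * (l + 1) if i == j else 0.0 for j in range(n)]
              for i in range(n)]
    # Index 0 corresponds to the state |l, l>
    return (max_abs_diff(comm, i_lz),
            max_abs_diff(l2, target),
            abs(lx2[0][0] - l / 2))


for two_l in (1, 2, 3, 4):
    print("l =", two_l / 2, "residuals:", check(two_l))
```

Passing 2l instead of l is a design choice that keeps the argument an integer for both the integer and the half-integer series.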


3.3.5 Statistical Interpretation 

In Chap. 2 I have already introduced the relation between coefficients in the 
superposition states and probabilities of various outcomes of the measurements on 
quantum systems. This time I will elaborate on those ideas in a more precise way by
formulating two postulates introducing statistical interpretation to the formalism of 
quantum mechanics. 

Postulate 4 (Born's Rule) A measurement of an observable can only yield a
value from the set of the eigenvalues of the operator representing the measured
observable. If a system before the measurement is not in a state described by
one of the eigenvectors of this operator, the result of the measurement cannot
be predicted a priori. Only a probability (or probability density for observables
with continuous spectrum) of a particular outcome can be known. If the measured
eigenvalue is not degenerate, this probability is given by

$$p_n = |\langle\alpha|\lambda_n\rangle|^2, \tag{3.77}$$

where $|\alpha\rangle$ represents the state of the system before the measurement, $\lambda_n$ is one of
the eigenvalues, and $|\lambda_n\rangle$ is the corresponding eigenvector. If the eigenvalue is
degenerate, the probabilities given by Eq. 3.77 must be summed over all the
degenerate states belonging to this eigenvalue. In the case of observables with
continuous spectrum, the probability is replaced with the probability density $p(q)$:

$$p(q) = |\langle\alpha|q\rangle|^2,$$

which determines the differential probability $dP$ that the measured value of the
observable lies within the interval of values $[q, q + dq]$ as $dP = p(q)\,dq$.
Postulate 5 Regardless of the state in which the system was before an observable
is measured, immediately after the measurement, the system will be in a state
represented by the eigenvector of the corresponding operator belonging to the
observed non-degenerate eigenvalue. If the measured eigenvalue is degenerate,
all we can state is that after the measurement the system will be in a state in the
subspace of eigenvectors belonging to this eigenvalue.


Both these postulates are essentially more accurate restatements of the propositions
already discussed in Sect. 2.2.3, where the somewhat vague notion of "the state
with definite values of an observable" is replaced with its mathematical representation
as an eigenvector of a respective operator. This more formal approach allows
carrying out a more comprehensive exploration of the statistical interpretation of
quantum mechanical formalism.

I begin by considering an expression of the form $\langle\alpha|\hat{T}|\alpha\rangle$, where $|\alpha\rangle$ is an
arbitrary state and $\hat{T}$ is a Hermitian operator representing a certain observable. I have
already mentioned that this expression is often referred to as the "expectation value," but
now I can demonstrate what it actually means. Expanding this state into eigenvectors
of $\hat{T}$ (Eq. 3.39), I can present $\langle\alpha|\hat{T}|\alpha\rangle$ as

$$\langle\alpha|\hat{T}|\alpha\rangle = \sum_n\sum_m a_n^* a_m\langle\chi_n|\hat{T}|\chi_m\rangle = \sum_n\sum_m a_n^* a_m\lambda_m\langle\chi_n|\chi_m\rangle = \sum_n\lambda_n|a_n|^2, \tag{3.78}$$

where I first took advantage of the fact that $|\chi_m\rangle$ is an eigenvector of $\hat{T}$ with
eigenvalue $\lambda_m$: $\hat{T}|\chi_m\rangle = \lambda_m|\chi_m\rangle$ and then used the orthonormalization condition for
the eigenvectors, $\langle\chi_n|\chi_m\rangle = \delta_{nm}$. According to Born's rule, $|a_n|^2$ is the probability
that the measurement of the observable will produce $\lambda_n$. Then, it becomes clear that
the final result in Eq. 3.78 has the meaning of the average value of the observable,
which one would "expect" to find if the same measurement is repeated multiple
times or if an experimentalist carries out the measurement on multiple identical
copies of the same system. The simplest measure of the statistical uncertainty of
such measurements would be the standard deviation, which in regular probability
theory would be defined as

$$\sigma_T = \sqrt{\overline{\lambda^2} - \bar{\lambda}^2},$$

where the bar above the letters means statistical averaging with probabilities given
by $p_n = |a_n|^2$: $\overline{\lambda^2} = \sum_n p_n\lambda_n^2$; $\bar{\lambda}^2 = \left(\sum_n p_n\lambda_n\right)^2$. In the context of quantum theory,
the measure of uncertainty of a measurement can be described as

$$\sigma_T = \sqrt{\langle\alpha|\hat{T}^2|\alpha\rangle - \left(\langle\alpha|\hat{T}|\alpha\rangle\right)^2}. \tag{3.79}$$
Indeed,

$$\begin{aligned}
\langle\alpha|\hat{T}^2|\alpha\rangle &= \sum_n\sum_m a_n^* a_m\langle\chi_n|\hat{T}^2|\chi_m\rangle = \sum_n\sum_m a_n^* a_m\lambda_m\langle\chi_n|\hat{T}|\chi_m\rangle \\
&= \sum_n\sum_m\lambda_m^2 a_n^* a_m\langle\chi_n|\chi_m\rangle = \sum_n\lambda_n^2|a_n|^2.
\end{aligned}$$

This shows that the measure of uncertainty expressed by Eq. 3.79 does agree with
the probabilistic definition of the standard deviation. If state $|\alpha\rangle$ is one of the
eigenvectors $|\chi_{n_0}\rangle$, all coefficients $a_n$ are zeroes, with the exception of $a_{n_0} = 1$.
In this case, we have $\langle\alpha|\hat{T}^2|\alpha\rangle = \lambda_{n_0}^2 = \left(\langle\alpha|\hat{T}|\alpha\rangle\right)^2$, and the uncertainty $\sigma_T$ vanishes.
This justifies calling states represented by eigenvectors determinate states, or states
in which the observable has a definite value. If there are several mutually consistent
observables represented by commuting operators, we can have a state which is a
common eigenvector of all operators and in which all observables have definite
values.

If two observables are not mutually consistent and are described by operators $\hat{T}_1$
and $\hat{T}_2$ that do not commute, one can derive the following inequality for the uncertainties
of these operators, $\sigma_{T_1}$ and $\sigma_{T_2}$:

$$\sigma_{T_1}\sigma_{T_2} \ge \frac{1}{2}\left|\langle\alpha|[\hat{T}_1, \hat{T}_2]|\alpha\rangle\right|, \tag{3.80}$$

which is valid for an arbitrary state $|\alpha\rangle$. This is the so-called generalized uncertainty
principle. Using canonical commutation relations 3.47, I can immediately reproduce
the Heisenberg inequality

$$\sigma_x\sigma_{p_x} \ge \frac{\hbar}{2}, \tag{3.81}$$

which now becomes a particular case of a more general result presented by
Eq. 3.80. It is interesting that using the Heisenberg uncertainty principle, Eq. 1.4, as
an empirical formula and combining it with Eq. 3.80, I can "derive" or justify, if you
want, the canonical commutator between the coordinate and momentum operators.
Indeed, since Eq. 3.81 is valid for an arbitrary state, in order to reconcile Eq. 3.81
with Eq. 3.80, I have to admit that the commutator of coordinate and momentum
operators must be a regular number (only in this case the right-hand side of Eq. 3.80
becomes proportional to $\langle\alpha|\alpha\rangle = 1$, so that the dependence on the state vanishes).
The absolute value of this number must obviously be equal to $\hbar$, but recalling that if
the commutator of two Hermitian operators is a number, it must be an imaginary
number (see Eq. 3.31), I can conclude that $[\hat{x}, \hat{p}_x] = i\hbar$, which is the canonical
commutation relation given in Eq. 3.47. Of course, these arguments are not sufficient
to show if this commutator is $+i\hbar$ or $-i\hbar$, but the choice of the sign is, actually, a
matter of convention, and the standard agreement is to write this commutator as
given in Eq. 3.47.
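
As a quick consistency check of the generalized uncertainty relation, one can apply it to angular momentum. This is only a sketch: it assumes that Eq. 3.80 has the standard form $\sigma_{T_1}\sigma_{T_2} \ge \frac{1}{2}|\langle\alpha|[\hat{T}_1, \hat{T}_2]|\alpha\rangle|$, and it takes $\hat{T}_1 = \hat{L}_x$, $\hat{T}_2 = \hat{L}_y$, and the state $|l, l\rangle$ from Sect. 3.3.4. The expectation values of $\hat{L}_x$ and $\hat{L}_y$ vanish in this state because both operators are combinations of $\hat{L}_\pm$, which change $m$, while $\langle l, l|\hat{L}_x^2|l, l\rangle = \langle l, l|\hat{L}_y^2|l, l\rangle = \hbar^2 l/2$ was found in Sect. 3.3.4:

```latex
\begin{align*}
\sigma_{L_x}\sigma_{L_y}
  &= \sqrt{\frac{\hbar^2 l}{2}}\,\sqrt{\frac{\hbar^2 l}{2}} = \frac{\hbar^2 l}{2}, \\
\frac{1}{2}\left|\langle l,l|[\hat{L}_x,\hat{L}_y]|l,l\rangle\right|
  &= \frac{1}{2}\left|i\hbar\,\langle l,l|\hat{L}_z|l,l\rangle\right| = \frac{\hbar^2 l}{2}.
\end{align*}
```

The two sides coincide, so for this particular state the inequality turns into an equality.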
To illustrate all these rather abstract postulates, I will finish this section with an
example, in which, to save time, I will again use matrices M1 and M2 defined by
Eq. 3.37.

Example 14 (Probabilities of Measurements) Assume that these matrices represent
two observables of some quantum system and that you intend to measure these
observables. It is given that the system is prepared in the state $|\eta\rangle$ represented by the
column

$$|\eta\rangle = \frac{1}{\sqrt{7}}\begin{bmatrix} 2i \\ 1 \\ 1 - i \end{bmatrix},$$

and you are asked to predict the results of different sequences of measurements
of observables M1 and M2. The first step you have to do is to verify that your
initial state is normalized, which is just a good housekeeping habit. The norm of
this vector is (do not forget to do complex conjugation when converting a ket into a
bra; for some reason even good students keep forgetting about it)

$$\langle\eta|\eta\rangle = \frac{1}{7}\begin{bmatrix} -2i & 1 & 1 + i \end{bmatrix}\begin{bmatrix} 2i \\ 1 \\ 1 - i \end{bmatrix} = \frac{1}{7}\left((-2i)(2i) + 1 + (1 + i)(1 - i)\right) = \frac{1}{7}(4 + 1 + 2) = 1.$$

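Once normalization is verified, you are ready for the next step. Let's say you
first want to measure the observable represented by M2. We found earlier that the
eigenvalues of this matrix are $\lambda_1 = 2$, $\lambda_2 = 1$, and $\lambda_3 = -1$. Thus, these are
the values that you can expect to see on the dial of your measuring device (more
or less, experimental errors are unavoidable, of course). The actual issue is to find
the corresponding probabilities. Using Born's rule, Eq. 3.77, and the corresponding
eigenvectors given in Eq. 3.38, you can find for each of the eigenvalues (each ket below
is labeled by the eigenvalue it belongs to)

$$p_{\lambda_1} = |\langle\eta|2\rangle|^2 = \left|\frac{1}{\sqrt{7}}\frac{1}{\sqrt{2}}\begin{bmatrix} -2i & 1 & 1 + i \end{bmatrix}\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}\right|^2 = \frac{1}{14}\left|2i + 1 + i\right|^2 = \frac{5}{7},$$

$$p_{\lambda_2} = |\langle\eta|1\rangle|^2 = \left|\frac{1}{\sqrt{7}}\frac{1}{2}\begin{bmatrix} -2i & 1 & 1 + i \end{bmatrix}\begin{bmatrix} 1 \\ -\sqrt{2} \\ 1 \end{bmatrix}\right|^2 = \frac{1}{28}\left|-2i - \sqrt{2} + 1 + i\right|^2 = \frac{(1 - \sqrt{2})^2 + 1}{28} = \frac{2 - \sqrt{2}}{14},$$

and

$$p_{\lambda_3} = |\langle\eta|{-1}\rangle|^2 = \left|\frac{1}{\sqrt{7}}\frac{1}{2}\begin{bmatrix} -2i & 1 & 1 + i \end{bmatrix}\begin{bmatrix} 1 \\ \sqrt{2} \\ 1 \end{bmatrix}\right|^2 = \frac{1}{28}\left|-2i + \sqrt{2} + 1 + i\right|^2 = \frac{(1 + \sqrt{2})^2 + 1}{28} = \frac{2 + \sqrt{2}}{14}.$$

It is always a good idea to run a quick check:

$$p_{\lambda_1} + p_{\lambda_2} + p_{\lambda_3} = \frac{5}{7} + \frac{2 - \sqrt{2}}{14} + \frac{2 + \sqrt{2}}{14} = \frac{5}{7} + \frac{2}{7} = 1,$$

as it should be. So far so good. The expectation value of M2 can be computed in two
different ways. First, I will use the standard probabilistic definition of the average:

$$\langle M2\rangle = p_{\lambda_1}\lambda_1 + p_{\lambda_2}\lambda_2 + p_{\lambda_3}\lambda_3 = \frac{5}{7}\cdot 2 + \frac{2 - \sqrt{2}}{14} - \frac{2 + \sqrt{2}}{14} = \frac{1}{7}\left(10 - \sqrt{2}\right).$$

And I will also compute this quantity using the quantum-mechanical definition:

$$\begin{aligned}
\langle M2\rangle = \langle\eta|M2|\eta\rangle &= \frac{1}{7}\begin{bmatrix} -2i & 1 & 1 + i \end{bmatrix}\begin{bmatrix} 1 & -\frac{1}{\sqrt{2}} & -1 \\ -\frac{1}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}} \\ -1 & -\frac{1}{\sqrt{2}} & 1 \end{bmatrix}\begin{bmatrix} 2i \\ 1 \\ 1 - i \end{bmatrix} \\
&= \frac{1}{7}\begin{bmatrix} -2i & 1 & 1 + i \end{bmatrix}\begin{bmatrix} 2i - \frac{1}{\sqrt{2}} - 1 + i \\ -\frac{2i}{\sqrt{2}} - \frac{1 - i}{\sqrt{2}} \\ -2i - \frac{1}{\sqrt{2}} + 1 - i \end{bmatrix} = \frac{1}{7}\begin{bmatrix} -2i & 1 & 1 + i \end{bmatrix}\begin{bmatrix} 3i - \frac{1}{\sqrt{2}} - 1 \\ -\frac{1 + i}{\sqrt{2}} \\ -3i - \frac{1}{\sqrt{2}} + 1 \end{bmatrix} \\
&= \frac{1}{7}\left(6 + \frac{2i}{\sqrt{2}} + 2i - \frac{1 + i}{\sqrt{2}} - 2i - \frac{1}{\sqrt{2}} + 4 - \frac{i}{\sqrt{2}}\right) = \frac{1}{7}\left(10 - \sqrt{2}\right),
\end{aligned}$$

again, exactly as promised. If immediately after measuring M2 you attempt to
measure M1 and are interested in the probabilities of various outcomes (now you are
talking about outcomes consisting of pairs of measurements, which are given by all
nine possible pairs of eigenvalues $(\lambda_i^{M2}, \lambda_j^{M1})$), you have to take into account
that after the first measurement, the system is no longer in the initial state $|\eta\rangle$.
Depending on the outcome of the first measurement, it will be in a state represented
by one of the eigenvectors of M2. However, since these two matrices commute, and
the eigenvectors of M1 are also eigenvectors of M2, the outcomes of the second
measurement are completely determined by the outcome of the first, and there are
only three possible results. For instance, if the first measurement produced for M2 the
value $-1$ (probability $(2 + \sqrt{2})/14$), the measurement of M1 is guaranteed
to yield 2 (the state corresponding to eigenvalue $-1$ of matrix M2 is described by
the same vector as the eigenvector of M1 belonging to its eigenvalue 2). Thus, the
probability of getting the pair $(-1, 2)$ is still $(2 + \sqrt{2})/14$.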

If you measure $M_1$ first, the situation is a bit more complex since $M_1$ has
degenerate eigenvalues. So, if you want, for instance, to find the probability of
getting 1 after measuring $M_1$, you have to compute two probabilities, one for each
degenerate state, and sum them up. To do that you can use the corresponding orthogonal
and normalized vectors given in Eq. 3.38, which are common eigenvectors of both $M_1$
and $M_2$. This will yield


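\[
P_1 = \left| \frac{1}{\sqrt{7}} \frac{1}{\sqrt{2}}
\begin{bmatrix} -2i & 1 & 1+i \end{bmatrix}
\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \right|^2
+ \left| \frac{1}{2} \frac{1}{\sqrt{7}}
\begin{bmatrix} -2i & 1 & 1+i \end{bmatrix}
\begin{bmatrix} 1 \\ -\sqrt{2} \\ 1 \end{bmatrix} \right|^2
= \frac{10}{14} + \frac{4 - 2\sqrt{2}}{28} = \frac{12 - \sqrt{2}}{14}.
\]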


At this point a question might pop up in your head: is this result unique? Indeed,
you already know that degenerate eigenvalues can be characterized by an infinite
number of different normalized and orthogonal eigenvectors. It would be nice if the
probability did not depend on this arbitrary choice, but is it really so? I will give
you a chance to answer this question as an exercise.

Finally, let me compute the uncertainty of the observable $M_2$ in this experiment.
For this computation I first need to find $M_2^2$, which is

\[
M_2^2 = \begin{bmatrix}
\frac{5}{2} & 0 & -\frac{3}{2} \\
0 & 1 & 0 \\
-\frac{3}{2} & 0 & \frac{5}{2}
\end{bmatrix}.
\]



Now you can compute 


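\[
\langle \eta | M_2^2 | \eta \rangle
= \frac{1}{7} \begin{bmatrix} -2i & 1 & 1+i \end{bmatrix}
\begin{bmatrix}
\frac{5}{2} & 0 & -\frac{3}{2} \\
0 & 1 & 0 \\
-\frac{3}{2} & 0 & \frac{5}{2}
\end{bmatrix}
\begin{bmatrix} 2i \\ 1 \\ 1-i \end{bmatrix}
= \frac{1}{7} \begin{bmatrix} -2i & 1 & 1+i \end{bmatrix}
\begin{bmatrix}
-\frac{3}{2} + \frac{13i}{2} \\
1 \\
\frac{5}{2} - \frac{11i}{2}
\end{bmatrix}
= \frac{22}{7},
\]

so that the uncertainty $\sigma_{M_2}$ is found from

\[
\sigma_{M_2}^2 = \langle \eta | M_2^2 | \eta \rangle - \langle \eta | M_2 | \eta \rangle^2
= \frac{22}{7} - \frac{1}{49}\left(10 - \sqrt{2}\right)^2
= \frac{52 + 20\sqrt{2}}{49}, \qquad \sigma_{M_2} \approx 1.28.
\]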
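
The arithmetic of this example is easy to check by machine. What follows is only a
minimal Python sketch and not part of the derivation itself: the matrix `M2` and the
three eigenvectors are written out as I infer them from the worked steps of the example
(they are assumptions, not a verbatim copy of Eq. 3.38), and the helper names `inner`
and `matvec` are hypothetical.

```python
import math

# State |eta> = (1/sqrt(7)) [2i, 1, 1-i]^T from the example.
norm = math.sqrt(7)
eta = [2j / norm, 1 / norm, (1 - 1j) / norm]

# ASSUMPTION: M2 and its eigenvectors as inferred from the worked steps
# (eigenvalues 2, -1, 1); they are not copied verbatim from Eq. 3.38.
a = 1 / math.sqrt(2)
M2 = [[1, -a, -1],
      [-a, 0, -a],
      [-1, -a, 1]]
eigenvectors = {
    2: [-a, 0.0, a],
    -1: [0.5, a, 0.5],
    1: [0.5, -a, 0.5],
}


def inner(u, v):
    """Inner product <u|v>; the components of u are complex conjugated."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))


def matvec(m, v):
    """Product of a square matrix (list of rows) with a column vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


# Born's rule: p = |<eigenvector|eta>|^2 for each eigenvalue of M2.
probs = {lam: abs(inner(vec, eta)) ** 2 for lam, vec in eigenvectors.items()}

# Expectation value two ways: probabilistic average and <eta|M2|eta>.
mean_from_probs = sum(lam * p for lam, p in probs.items())
mean_direct = inner(eta, matvec(M2, eta)).real

# <eta|M2^2|eta> and the variance <M2^2> - <M2>^2.
mean_square = inner(eta, matvec(M2, matvec(M2, eta))).real
variance = mean_square - mean_direct ** 2

print(probs, mean_from_probs, mean_direct, mean_square, variance)
```

Swapping in a different orthonormal pair for the degenerate eigenvalue of $M_1$ in the
same script is one way to explore the uniqueness question raised above.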


3.4 Problems 

Section 3.1 

Problem 10 A constant force F is acting on a particle of mass m. Derive an 
expression for the potential energy associated with this force, write down the 
Hamiltonian of the system, and derive Hamiltonian equations. 

Problem 11 Consider a particle moving in a central potential field with Hamiltonian

\[
H = \frac{p^2}{2m} + V(|\mathbf{r}|).
\]

Compute the following Poisson brackets:

\[
\{L_x, H\}, \quad \{L_y, H\}, \quad \{L_z, H\},
\]

where $L_{x,y,z}$ are Cartesian components of the angular momentum of the particle in
some arbitrarily chosen coordinate system. Interpret the results.

Section 3.2.1

Problem 12 Which of the following is a linear operator?

1. Inversion operator $\hat{P}$, which acts on functions of coordinates according to the
rule $\hat{P}f(\mathbf{r}) = f(-\mathbf{r})$.

2. Square operator $\hat{S}$ defined as $\hat{S}f = f^2$.

3. Determinant operator Det, which when applied to a square matrix turns it into
the matrix’s determinant.

4. Exchange operator $\hat{E}$ acting on functions of two variables as
$\hat{E}f(x_1, x_2) = f(x_2, x_1)$.

5. Trace operator Tr, which acts on a matrix and turns it into the sum of its diagonal
elements.

Problem 13 Prove the linearity of the rotation operator. 

Problem 14 Find a Hermitian conjugate for the integral operator $\hat{K}$ acting on
integrable functions of a single variable and defined by kernel $K(x_1, x_2)$:

\[
\hat{K}f = \int_{-\infty}^{\infty} K(x_1, x_2) f(x_2)\, dx_2.
\]

The inner product is defined in a regular way: $\langle g|f\rangle = \int g^*(x) f(x)\,dx$.
Determine under which condition on the kernel this operator is Hermitian.

Problem 15 Expression $\hat{P} = |\alpha\rangle\langle\beta|$ can be understood as an
operator acting in the following way:

\[
\hat{P}|\psi\rangle = |\alpha\rangle \langle\beta|\psi\rangle.
\]

Find its Hermitian conjugate.


Section 3.2.2 


Problem 16 Specify the condition that must be obeyed by an operator so that it is 
both unitary and Hermitian. 

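Consider the following matrices:

\[
\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & i \\ -i & 0 \end{bmatrix}.
\]

Do they satisfy this condition?

Problem 17 For three operators $\hat{A}$, $\hat{B}$, and $\hat{C}$, prove the following
identity (known as the Jacobi identity):

\[
\left[[\hat{A},\hat{B}],\hat{C}\right] + \left[[\hat{C},\hat{A}],\hat{B}\right]
+ \left[[\hat{B},\hat{C}],\hat{A}\right] = 0.
\]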

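Problem 18 Which of the following matrices are Hermitian?

1.
\[
\begin{bmatrix} 3i & 5i & 7 \\ -5i & 2 & 3 \\ 7 & 3 & 0 \end{bmatrix}
\]

2.
\[
\begin{bmatrix} 1 & i & 2i \\ -i & 0 & 3 \\ -2i & 3 & 2 \end{bmatrix}
\]

3.
\[
\begin{bmatrix} \sqrt{2} & 1 & -2 \\ -1 & 2 & 4\sqrt{5} \\ 7 & -4\sqrt{5} & \sqrt{3} \end{bmatrix}
\]

4.
\[
\begin{bmatrix} 7 & 4 & 2 \\ 4 & 2 & 1 \\ 2 & 1 & -4 \end{bmatrix}
\]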


Problem 19 Prove the identity

\[
\left(\hat{A}\hat{B}\right)^{-1} = \hat{B}^{-1}\hat{A}^{-1}.
\]

Problem 20 Prove the following properties of the commutators:

\[
[\hat{T}_1, \hat{T}_2] = -[\hat{T}_2, \hat{T}_1]
\]
\[
[\hat{T}_1 + \hat{T}_2, \hat{T}_3] = [\hat{T}_1, \hat{T}_3] + [\hat{T}_2, \hat{T}_3]
\]
\[
[c_1\hat{T}_1, c_2\hat{T}_2] = c_1 c_2 [\hat{T}_1, \hat{T}_2].
\]

Problem 21 If operator $\hat{D}$ is defined as

\[
\hat{D}f(x) = \frac{df}{dx},
\]

what would be an inverse of this operator?

Problem 22 Find the inverses of the following matrices:

1.
\[
\begin{bmatrix} 1 & i & 2i \\ -i & 0 & 3 \\ -2i & 3 & 2 \end{bmatrix}
\]

2.
\[
\begin{bmatrix} 0 & i & 2 \\ -i & 0 & 1 \\ -i & i & 0 \end{bmatrix}
\]

Problem 23 Consider an operator $\hat{\sigma}$ characterized by the following property:
$\hat{\sigma}^2 = \hat{I}$, where $\hat{I}$ is a unity operator. Using power series
expansion, find the closed-form expression (not in the form of a series) for the
operator $\exp(i\hat{\sigma}t)$.

Problem 24 Prove that if the commutator of two Hermitian operators is a number, 
this number is necessarily imaginary. 

Problem 25 Given that $[\hat{x}, \hat{p}] = i\hbar$, compute

[^ 2 ]. 


Section 3.2.3 



Problem 26 Consider matrices

\[
\begin{bmatrix} 0 & i \\ -i & 0 \end{bmatrix}
\quad \text{and} \quad
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}.
\]

1. Find the eigenvalues and normalized eigenvectors of these matrices. 

2. Check orthogonality of the found vectors. 

Problem 27 Consider two matrices:

\[
A_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{bmatrix}; \quad
A_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}.
\]

1. Show that these operators commute.

2. Find a set of eigenvectors common for both of them.

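Problem 28 Find eigenvalues and normalized eigenvectors of the following matrix:

\[
\begin{bmatrix}
1 & \frac{1}{\sqrt{2}} & -1 \\
\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\
-1 & \frac{1}{\sqrt{2}} & 1
\end{bmatrix}
\]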


Problem 29 Consider the following matrix:

\[
A = \begin{bmatrix} 0 & 0 & -1 \\ 0 & 1 & 0 \\ -1 & 0 & 0 \end{bmatrix}
\]

1. Find its eigenvalues. Are there degenerate ones?

2. Construct a system of normalized and orthogonal eigenvectors.

3. Show that

\[
e^{xA} = \cosh x + A \sinh x.
\]


Section 3.3.1 

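Problem 30 Consider an operator defined as

\[
\hat{A} = |\phi_1\rangle\langle\phi_1| + |\phi_2\rangle\langle\phi_2| + |\phi_3\rangle\langle\phi_3|
- i|\phi_1\rangle\langle\phi_2| - |\phi_1\rangle\langle\phi_3|
+ i|\phi_2\rangle\langle\phi_1| - |\phi_3\rangle\langle\phi_1|
\]

where $|\phi_1\rangle$, $|\phi_2\rangle$, and $|\phi_3\rangle$ form an orthonormalized basis.

1. Check if this operator is Hermitian by computing $\hat{A}^\dagger$.

2. Compute $\hat{A}^2$.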

3. What are the possible values an experimentalist can observe when measuring an 
observable represented by this operator? 

4. Find states in which the system will be immediately after the measurement for 
each of the possible outcomes. Verify that the states are presented by orthogonal 
vectors. 

Problem 31 Show that if $\hat{P}$ is a projection operator, $\hat{I} - \hat{P}$ is also
a projection operator.

Section 3.3.2


Problem 32 Derive the commutation relations

\[
[\hat{L}_z, \hat{L}_x] = i\hbar \hat{L}_y
\]
\[
[\hat{L}_y, \hat{L}_z] = i\hbar \hat{L}_x.
\]


Problem 33 Prove that the operator of the square of the angular momentum,
$\hat{L}^2$, commutes with all components of the angular momentum operator,
$\hat{L}_{x,y,z}$.

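Problem 34 Compute commutators

\[
[\hat{L}_z, \hat{x}], \quad [\hat{L}_z, \hat{y}], \quad [\hat{L}_z, \hat{z}]
\]
\[
[\hat{L}_z, \hat{p}_x], \quad [\hat{L}_z, \hat{p}_y], \quad [\hat{L}_z, \hat{p}_z]
\]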

[L,p x ]. [ 4 . 4 ]. [ 4 . 4 ] 


Problem 35 Prove that

\[
[\hat{L}_{x,y,z}, \hat{p}^2] = [\hat{L}^2, \hat{p}^2] = 0.
\]


Section 3.3.4

Problem 36 Prove that

\[
\hat{L}_- |l, m\rangle = \hbar\sqrt{l(l+1) - m(m-1)}\; |l, m-1\rangle.
\]

Problem 37 Compute the following expressions:

\[
\langle l, m'| \hat{L}_- |l, m\rangle
\]
\[
\langle l, m'| \hat{L}_+ |l, m\rangle.
\]

For $l = 1$ present the results as a matrix.

Problem 38 Compute

\[
\langle l, m'| \hat{L}_x^2 |l, m\rangle.
\]


Hint: Use the representation of $\hat{L}_x$ in terms of raising and lowering ladder operators.


Section 3.3.5 

Problem 39 An observable A represented by an operator $\hat{A}$ can be in two mutually
exclusive states represented by eigenvectors of $\hat{A}$, $|a_1\rangle$ and
$|a_2\rangle$, where $a_{1,2}$ are corresponding eigenvalues. The second observable B
represented by an operator $\hat{B}$ also can be in two mutually exclusive states
represented by eigenvectors of $\hat{B}$, $|b_1\rangle$ and $|b_2\rangle$, where
$b_{1,2}$ are corresponding eigenvalues. These eigenvectors can be related to each
other as

\[
|a_1\rangle = \frac{1}{5}\left(3|b_1\rangle + 4i|b_2\rangle\right)
\]
\[
|a_2\rangle = \frac{1}{5}\left(4|b_1\rangle - 3i|b_2\rangle\right).
\]

1. If observable A is measured and value $a_1$ is obtained, what is the state of the
system immediately after the measurement?

2. If now B is measured, what are the possible outcomes, and what are their
probabilities?

3. Right after B was measured, A is measured again. What is the probability of
getting $a_1$ for different possible outcomes of the first measurement?

Problem 40 A quantum system is in a state described by a vector 

'“ i>= 

Find the probability that a measurement of some observable will bring the system 
to state described by a vector 

|0!2> = ~7f l/l} + 71 l/2> + 71 l/3) 

where |^ 1 , 2 , 3 ) form an orthonormalized basis. 

Problem 41 Consider a quantum system in a state described by a column vector

\[
\begin{bmatrix} -i \\ 2 \\ 0 \end{bmatrix}.
\]

The system is characterized by two observables $T_1$ and $T_2$ presented by matrices

\[
T_1 = \begin{bmatrix} 1 & i & 1 \\ -i & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}; \quad
T_2 = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 1 & i \\ 0 & -i & 0 \end{bmatrix}.
\]

1. If $T_1$ is measured first and $T_2$ immediately afterward, what is the probability of
obtaining $-1$ for $T_1$ and 3 for $T_2$?

2. What are the probabilities of getting the same values if the order of measurements 
is reversed? Discuss the result in terms of commutation properties of the two 
matrices. 

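Problem 42 Consider a system described by the Hamiltonian

\[
\hat{H} = \begin{bmatrix} 0 & -i & 0 \\ i & 3 & 3 \\ 0 & 3 & 0 \end{bmatrix}
\]

placed in a quantum state described by a column vector

\[
|\psi\rangle = \begin{bmatrix} 4 - i \\ -2 + 5i \\ 3 + 2i \end{bmatrix}.
\]
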

1. Find the expectation value of energy in this state. 

2. Find the uncertainty of energy in this state. 

3. Find the possible values of energy measurements and their probabilities. 

4. Use the results of the previous task to calculate the expectation value and 
uncertainty of energy again. Compare the results with results of tasks 1 and 2. 

Problem 43 Go back to Example 14 at the end of the chapter, and using a different
set of orthogonal and normalized eigenvectors of $M_1$ (you will have to find it first,
of course), compute the probability of getting the degenerate eigenvalue of $M_1$. Is
the result the same?

Problem 44 Consider a system described by a Hamiltonian

\[
\hat{H} = -\frac{1}{2}\frac{d^2}{dx^2} + \frac{1}{2}x^2
\]

presented by an operator acting on square-integrable functions of a single variable
$x$ forming a Hilbert space with an inner product defined in Sect. 2.1. This system is
prepared in state

where vectors $|\phi_{1,2}\rangle$ are defined as the following functions:

\[
|\phi_1\rangle = \exp\left(-\frac{x^2}{2}\right); \quad
|\phi_2\rangle = \left(1 - 2x^2\right)\exp\left(-\frac{x^2}{2}\right).
\]

1. Verify that these functions are eigenvectors of the Hamiltonian, determine the
respective eigenvalues, and normalize the eigenvectors.

2. Rewrite the expression for the state $|\psi\rangle$ in terms of normalized versions of
the vectors $|\phi_{1,2}\rangle$.

3. If the energy of the system is measured, what are the possible outcomes, and
what are their probabilities?

4. Find expectation values and uncertainties of the operators

\[
\hat{P}f(x) = -i\frac{df}{dx}; \quad \hat{X}f(x) = xf(x)
\]

in state $|\psi\rangle$.



Chapter 4 

Unitary Operators and Quantum 
Dynamics 




In the previous section, I explained how one can dig out experimentally relevant
information using states of a quantum system and operators representing the
observables. The remaining burning question, however, is how we can find these states
so that we could use these methods. In a typical experiment, an experimentalist
begins by “preparing” a quantum system in some state, which they believe they
know.¹ After that they smash the system with a hammer, or hit it with laser light,
or subject it to an electric or magnetic field, wait for some time, and measure
new values of the selected observables. In order to predict the results of new
measurements, you must be able to describe how the quantum system changes
between the time of preparation and the time of subsequent measurement, or,
speaking more scientifically, you must know its dynamics. As was made clear
in the previous section, you need two objects to predict the results of a measurement:
a state of the system and the operator assigned to the measured observable. Now
you can ask an interesting question: “When the quantum system evolves in time,
what is actually changing: the state or the operator?” To make this question more
specific, consider an expectation value of an observable described by operator
$\hat{T}$: $\langle\alpha|\hat{T}|\alpha\rangle$. When your system evolves, this expectation
value becomes a function of time. The question is, which element of the expression for
the expectation value, $\hat{T}$ or $|\alpha\rangle$, must be considered as a
time-dependent quantity to describe the dynamics of the expectation value? It turns
out that time dependence can be ascribed to either of these two elements, and
depending on the choice, it will generate two different but equivalent pictures of
quantum mechanics. In the so-called Schrödinger picture, the state vectors are treated
as time-dependent quantities, while operators remain fixed rules transforming the
states. In the Heisenberg picture, the state vector is considered as a constant, and
all the dynamics of the system is ascribed to the time-dependent operators. The
origins of these two pictures can be found in the earlier days of quantum theory with
the Heisenberg matrix mechanics competing against Schrödinger’s matter wave theory.
The first attempt to prove equivalence of the two pictures was undertaken by
Schrödinger as early as 1926, but the rigorous mathematical proof of the equivalence
did not exist until John von Neumann published in 1932 his definitive book
Mathematical Foundations of Quantum Mechanics.

¹ Preparation of a quantum system in a predefined state usually consists in carrying
out a measurement, but it is not an easy task to prepare a system in a state we want.

© Springer International Publishing AG, part of Springer Nature 2018
L.I. Deych, Advanced Undergraduate Quantum Mechanics,

Von Neumann was one of the major figures in mathematics and mathematical
physics of the twentieth century. Born to a rich Jewish family in Hungary, which
was elevated to nobility by Austro-Hungarian Emperor Franz Joseph (hence the prefix
von in his name), he was a child prodigy, got his Ph.D. in mathematics at the age
of 23, and became the youngest privatdocent at the University of Berlin. In 1929 he
got an offer from Princeton University and moved to the USA. He brought his entire
family to America in 1938, saving them from certain death. In addition to laying
a rigorous mathematical foundation for quantum theory, von Neumann is famous for
his role in the Manhattan Project and for developing the concept of digital computers
(among other things).

After this brief historical detour, I begin the presentation of quantum dynamics,
starting with the Schrödinger picture.


4.1 Schrödinger Picture

4.1.1 Time-Evolution Operator and Schrödinger Equation

The statistical interpretation of quantum mechanical formalism makes sense only
if all vectors describing states of a quantum system remain normalized at all times. I
will begin digging deeper into this issue by computing the norm of a generic vector
$\|\alpha\|$ using Eq. 3.39. First, I need the corresponding bra vector:

\[
\langle\alpha| = \sum_n a_n^* \langle\lambda_n|
\]

so that I can write for the norm

\[
\|\alpha\|^2 = \langle\alpha|\alpha\rangle
= \sum_m \sum_n a_m a_n^* \langle\lambda_n|\lambda_m\rangle
= \sum_m \sum_n a_m a_n^* \delta_{mn} = \sum_n |a_n|^2.
\tag{4.1}
\]

According to postulate 4 in Sect. 3.3.5, $|a_n|^2$ is equal to the probability $p_n$
that the respective eigenvalue will be observed. Equation 4.1 in this case can be
interpreted as a statement that the squared norm of a generic vector is equal to the
sum of probabilities of all possible measurement outcomes. The latter must obviously be
equal to unity, $\sum_n p_n = 1$, regardless of the time dependence of state
$|\alpha\rangle$. This result has quite a profound consequence. Indeed, time dependence
of a state vector can be considered as a transformation of a vector
$|\alpha(t_0)\rangle$ defined at some initial instant of time $t_0$ into another vector
$|\alpha(t)\rangle$ at time $t$ under the action of an operator:

\[
|\alpha(t)\rangle = \hat{U}(t, t_0)|\alpha(t_0)\rangle. \tag{4.2}
\]

In order to keep the norm of the vector unchanged, the operator $\hat{U}(t, t_0)$ must
be unitary, which significantly limits the class of operators that can be used to
describe the dynamics of quantum states. It also must obey an obvious condition:

\[
\hat{U}(t_0, t_0) = \hat{I}. \tag{4.3}
\]

Now, consider an evolution of the system from state $|\alpha(t_0)\rangle$ to state
$|\alpha(t_1)\rangle$ and then to state $|\alpha(t_f)\rangle$, which can be described as

\[
|\alpha(t_1)\rangle = \hat{U}(t_1, t_0)|\alpha(t_0)\rangle
\]
\[
|\alpha(t_f)\rangle = \hat{U}(t_f, t_1)|\alpha(t_1)\rangle.
\]

I can also describe a system’s dynamics from the initial state to the final, bypassing
the intermediate state:

\[
|\alpha(t_f)\rangle = \hat{U}(t_f, t_0)|\alpha(t_0)\rangle.
\]

Comparing this with the two lines of the previous equation, you can infer an
important property of the time-evolution operator $\hat{U}$:

\[
\hat{U}(t_f, t_0) = \hat{U}(t_f, t_1)\hat{U}(t_1, t_0). \tag{4.4}
\]


An important corollary of Eq. 4.4 is obtained by setting $t_f = t_0$, which yields

\[
\hat{U}(t_0, t_1)\hat{U}(t_1, t_0) = \hat{I} \;\Rightarrow\;
\hat{U}(t_0, t_1) = \hat{U}^{-1}(t_1, t_0) \tag{4.5}
\]

where I also used Eq. 4.3. In other words, the reversal of time in quantum dynamics
is equivalent to replacing the time-evolution operator with its inverse. This idea
can also be expressed by saying that by inverting the time-evolution operator, you
describe the evolution of the system from present to the past. This property can also
be described as reversibility of quantum dynamics: taking a system from $t_0$ to $t_f$
and back brings the system to its original state, completely reversing its initial
evolution.
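
The properties collected in Eqs. 4.3, 4.4, and 4.5 can be tried out on the simplest
possible model. The sketch below is only an illustration under stated assumptions: it
takes a two-level system with a hypothetical Hamiltonian proportional to the Pauli
matrix $\sigma_x$ (with $\hbar = 1$), for which the evolution operator has a closed form
because $\sigma_x^2$ is the identity. The names `evolution`, `matmul`, `dagger`, and
`close` are hypothetical helpers.

```python
import math

OMEGA = 1.3  # hypothetical coupling strength, in units with hbar = 1


def evolution(t, t0):
    """U(t, t0) for the assumed Hamiltonian H = OMEGA * sigma_x.

    Because sigma_x squared is the identity, the exponential series sums to
    cos(theta) * I - i * sin(theta) * sigma_x with theta = OMEGA * (t - t0).
    """
    theta = OMEGA * (t - t0)
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -1j * s], [-1j * s, c]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def close(a, b, tol=1e-12):
    """Element-wise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


IDENTITY = [[1, 0], [0, 1]]
t0, t1, tf = 0.2, 1.1, 2.7

is_unitary = close(matmul(dagger(evolution(tf, t0)), evolution(tf, t0)), IDENTITY)
starts_at_identity = close(evolution(t0, t0), IDENTITY)           # Eq. 4.3
composes = close(matmul(evolution(tf, t1), evolution(t1, t0)),
                 evolution(tf, t0))                                # Eq. 4.4
reverses = close(evolution(t0, t1), dagger(evolution(t1, t0)))     # Eq. 4.5
print(is_unitary, starts_at_identity, composes, reverses)
```

The last check combines Eq. 4.5 with unitarity: the operator that runs the system
backward in time coincides with the Hermitian conjugate of the forward one.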

Now, let me consider the action of $\hat{U}(t_1, t_0)$ over an infinitesimally small
time interval between $t_0$ and $t_1 = t_0 + dt$. Expanding this operator over the
small interval $dt$ and using Eq. 4.3, I can write:

\[
\hat{U}(t_0 + dt, t_0) = \hat{I} + \hat{G}\,dt \tag{4.6}
\]

where $\hat{G} = \left. d\hat{U}/dt \right|_{t=t_0}$ is an operator obtained by
differentiating the time-evolution operator with respect to time. The inverse of the
operator defined by Eq. 4.6 can be found by expanding the function $(1 + x)^{-1}$ with
respect to $x$ and keeping only the terms linear in $x$: $(1 + x)^{-1} \approx 1 - x$.
Applying this to operator $(\hat{I} + \hat{G}\,dt)$, I get

\[
\hat{U}^{-1}(t_0 + dt, t_0) = \hat{I} - \hat{G}\,dt.
\]

At the same time, Hermitian conjugation of Eq. 4.6 returns

\[
\hat{U}^\dagger(t_0 + dt, t_0) = \hat{I} + \hat{G}^\dagger dt.
\]

Since the time-evolution operator is unitary ($\hat{U}^{-1} = \hat{U}^\dagger$),
operator $\hat{G}$ has to be anti-Hermitian: $\hat{G}^\dagger = -\hat{G}$, so that it
can be presented as $\hat{G} = -i\hat{H}/\hbar$ (see Eq. 3.30), where $\hat{H}$ is a
Hermitian operator and $\hbar$ is introduced to ensure that $\hat{H}$ has the
dimension of energy. Indeed, since the time-evolution operator is dimensionless,
it is clear that operator $\hat{G}$ has the dimension of inverse time. The dimension of
Planck’s constant is that of energy multiplied by time, so it is clear that $\hat{H}$
has indeed the dimension of energy. This simple analysis leads the way to the next
postulate of quantum theory.

Postulate 6 Hermitian operator $\hat{H}$ in the expansion of the time-evolution operator
is the operator version of the Hamiltonian function of classical mechanics.


Thus, Eq. 4.6 can now be rewritten as

\[
\hat{U}(t_0 + dt, t_0) = \hat{I} - \frac{i}{\hbar}\hat{H}\,dt. \tag{4.7}
\]

Taking advantage of the composition rule, Eq. 4.4, I can write:

\[
\hat{U}(t + dt, t_0) = \hat{U}(t + dt, t)\,\hat{U}(t, t_0)
= \left(\hat{I} - \frac{i}{\hbar}\hat{H}\,dt\right)\hat{U}(t, t_0)
= \hat{U}(t, t_0) - \frac{i}{\hbar}\hat{H}\hat{U}(t, t_0)\,dt
\]

where I also used Eq. 4.7. The main difference between this last expression and
Eq. 4.7 is that $t$ in the former can be separated from $t_0$ by a finite interval. The
last equation can be rewritten in the form of a differential equation:

\[
\frac{d\hat{U}(t, t_0)}{dt} = -\frac{i}{\hbar}\hat{H}\hat{U}(t, t_0). \tag{4.8}
\]

Applying Eq. 4.7 to Eq. 4.2, I can also derive:

\[
|\alpha(t + dt)\rangle = \left(\hat{I} - \frac{i}{\hbar}\hat{H}\,dt\right)|\alpha(t)\rangle
\;\Rightarrow\;
\frac{|\alpha(t + dt)\rangle - |\alpha(t)\rangle}{dt} = -\frac{i}{\hbar}\hat{H}|\alpha(t)\rangle
\]

which can be rewritten in a standard form

\[
i\hbar\frac{d|\alpha\rangle}{dt} = \hat{H}|\alpha\rangle \tag{4.9}
\]

called the Schrödinger equation. As any differential equation, Eq. 4.9 has to be
complemented by an initial condition specifying the state of the system at an
arbitrarily chosen initial time. Given the role of the Hamiltonian in classical
mechanics discussed in Sect. 3.1, it is not very surprising that the same quantity (in
its operator reincarnation) determines the dynamics of quantum systems as well.
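
The passage from the infinitesimal operator of Eq. 4.7 to finite-time evolution can be
illustrated numerically: composing many short steps of the form $\hat{I} -
(i/\hbar)\hat{H}\,dt$ should approach the exact evolution as $dt$ shrinks. The sketch
below assumes a hypothetical two-level Hamiltonian proportional to $\sigma_x$ with
$\hbar = 1$, chosen only because its exact evolution is known in closed form; the
function names are hypothetical as well. Note that each first-order step is only
approximately unitary, so the agreement improves with the number of steps rather than
being exact.

```python
import math

OMEGA = 1.0    # hypothetical energy scale (hbar = 1), H = OMEGA * sigma_x
T_FINAL = 1.0  # total evolution time


def step(psi, dt):
    """One short step: psi -> (I - i H dt) psi, the analogue of Eq. 4.7."""
    a, b = psi
    return [a - 1j * OMEGA * dt * b, b - 1j * OMEGA * dt * a]


def evolve_in_steps(psi0, n_steps):
    """Compose n_steps short-time operators, as the composition rule allows."""
    dt = T_FINAL / n_steps
    psi = list(psi0)
    for _ in range(n_steps):
        psi = step(psi, dt)
    return psi


def exact(psi0):
    """Closed-form exp(-i H t) psi0, valid here because sigma_x squared is I."""
    c, s = math.cos(OMEGA * T_FINAL), math.sin(OMEGA * T_FINAL)
    a, b = psi0
    return [c * a - 1j * s * b, -1j * s * a + c * b]


def distance(u, v):
    """Euclidean distance between two state vectors."""
    return math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(u, v)))


psi0 = [1.0, 0.0]
error_coarse = distance(evolve_in_steps(psi0, 100), exact(psi0))
error_fine = distance(evolve_in_steps(psi0, 10000), exact(psi0))
print(error_coarse, error_fine)
```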


4.1.2 Stationary States 


If the Hamiltonian does not contain explicit time dependence, which might appear, for
instance, if an atom interacts with a time-dependent electric field $E(t)$, Eq. 4.9
has a very simple formal solution:

\[
|\alpha(t)\rangle = \exp\left(-\frac{i}{\hbar}\hat{H}t\right)|\alpha_0\rangle, \tag{4.10}
\]

where $|\alpha_0\rangle$ is the state of the system at time $t = 0$. For practical
calculations, however, this solution is not very helpful because the action of the
exponent of an operator on an arbitrary vector in general is not easy to compute. The
situation becomes much simpler if the initial state is presented by one of the
eigenvectors of the Hamiltonian. If $|\alpha_0\rangle = |\chi_n\rangle$, where
$|\chi_n\rangle$ is an eigenvector of $\hat{H}$ with respective eigenvalue $E_n$

\[
\hat{H}|\chi_n\rangle = E_n|\chi_n\rangle, \tag{4.11}
\]

the right-hand side of Eq. 4.10 can be computed as follows:

\[
|\alpha(t)\rangle = \exp\left(-\frac{i}{\hbar}\hat{H}t\right)|\chi_n\rangle
= \exp\left(-\frac{i}{\hbar}E_n t\right)|\chi_n\rangle, \tag{4.12}
\]

where I used the definition of the exponential function of an operator, Eq. 3.21, and
the fact that

\[
\hat{H}^m|\chi_n\rangle = E_n^m|\chi_n\rangle,
\]

which is easily proved. (You will have a chance to prove it when doing your
homework.) Thus, if a system is initially in a state represented by an eigenvector of
the Hamiltonian, it remains in this state forever and ever. The time-dependent factor
in this case is a complex number with absolute value equal to unity (pure phase
as physicists like to say) and does not affect, therefore, any measurable quantities.
Indeed, consider, for instance, an expectation value of some generic operator
$\hat{T}$ when a system is in the state described by Eq. 4.12:

\[
\langle\alpha(t)|\hat{T}|\alpha(t)\rangle
= \exp\left(i\frac{E_n}{\hbar}t\right)\langle\chi_n|\hat{T}|\chi_n\rangle
\exp\left(-i\frac{E_n}{\hbar}t\right) = \langle\chi_n|\hat{T}|\chi_n\rangle.
\]

The eigenvector equation 4.11 is often called the time-independent Schrödinger
equation, and its solutions represent the very same stationary states, which were
postulated by Bohr, whose existence was proven in Davisson-Germer experiments
mentioned in the Introduction, and which became the main object of the Schrödinger
wave mechanics. The corresponding eigenvalues are called energy levels or simply
energies. Here I will use the term stationary states to designate solutions of the
time-independent Schrödinger equation with exponential time dependence attached
to the eigenvectors and given by Eq. 4.12. As you just saw, this time dependence
does not affect experimentally observable quantities, which remain independent of
time, justifying the name “stationary” for these states.

I need to point out a general ambiguity in the relation between quantum states
and vectors representing them: the latter are always defined up to a phase, meaning
that all vectors can be multiplied by a complex number of unit magnitude without
affecting any physical results. This is obvious from Eq. 4.10, which does not change
upon multiplying the state vector by any constant factor. But since it is required
that the states are normalized, this constant factor is limited to have a magnitude
equal to unity, i.e., to be a pure phase. However, in the case presented in Eq. 4.12,
the multiplying factor is not constant and, therefore, cannot be simply dismissed,
making it physically significant. This significance manifests itself, however, only
when we have to deal with several stationary states. Indeed, since energy is always
defined up to an additive constant, one can always make the energy eigenvalue
corresponding to any one of the stationary states vanish, thereby killing the time
dependence of the corresponding stationary state. This vanishing trick, however, can
be achieved only for one state, while all others will retain their exponential factor,
albeit with different energy values equal to the difference between their initial
values and the one you chose to be equal to zero.² In order to demonstrate that this
general property of energies retains its meaning in quantum theory as well, I will
consider a state evolving from a superposition of two eigenvectors of a Hamiltonian
with different eigenvalues. Thus, assume that the initial state of the system is

\[
|\alpha_0\rangle = a_1|\chi_1\rangle + a_2|\chi_2\rangle.
\]

Using linearity of the time-evolution operator and results from Eq. 4.12, I can easily
compute:

\[
\begin{aligned}
|\alpha(t)\rangle &= a_1\exp\left(-i\frac{E_1}{\hbar}t\right)|\chi_1\rangle
+ a_2\exp\left(-i\frac{E_2}{\hbar}t\right)|\chi_2\rangle \\
&= \exp\left(-i\frac{E_1}{\hbar}t\right)\left[a_1|\chi_1\rangle
+ a_2\exp\left(-i\frac{E_2 - E_1}{\hbar}t\right)|\chi_2\rangle\right].
\end{aligned}
\tag{4.13}
\]

² Technically this can be achieved by subtracting one of the energy eigenvalues from
the potential appearing in the Hamiltonian, which is equivalent to a simple change of
the zero level of the energies.



Equation 4.13 shows that an initial vector in the form of a superposition of two
eigenvectors of a Hamiltonian evolves by “dressing up” each of the initial states with
the exponential time factor containing the energy eigenvalue corresponding to the
respective eigenvector. However, the absolute values of these energy eigenvalues
are again not important as the dynamics of the state is determined by the difference
between them. I emphasized this point in the last line of Eq. 4.13 by factoring out
one of the time-dependent exponential factors. It is clear that the overall phase factor
will again disappear from all experimentally relevant expressions, and the entire
time dependence will be determined by $\exp\left(-i\frac{E_2 - E_1}{\hbar}t\right)$.
Apparently, it would not matter for these dynamics if I factored out the other
exponential factor. To illustrate this point, I will now compute an expectation value
of some generic operator with the state described by Eq. 4.13:


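\[
\begin{aligned}
\langle\alpha(t)|\hat{T}|\alpha(t)\rangle
&= \exp\left(i\frac{E_1}{\hbar}t\right)
\left[a_1^*\langle\chi_1| + a_2^*\langle\chi_2|\exp\left(i\frac{E_2 - E_1}{\hbar}t\right)\right]
\hat{T}
\left[a_1|\chi_1\rangle + a_2\exp\left(-i\frac{E_2 - E_1}{\hbar}t\right)|\chi_2\rangle\right]
\exp\left(-i\frac{E_1}{\hbar}t\right) \\
&= |a_1|^2 T_{11} + |a_2|^2 T_{22}
+ T_{12}\,a_1^* a_2 \exp\left(-i\frac{E_2 - E_1}{\hbar}t\right)
+ T_{21}\,a_2^* a_1 \exp\left(i\frac{E_2 - E_1}{\hbar}t\right)
\end{aligned}
\]

where $T_{ij} = \langle\chi_i|\hat{T}|\chi_j\rangle$. Taking into account that for
Hermitian operators diagonal elements are real-valued and nondiagonal elements are
complex conjugates of their transposed elements ($T_{ij} = T_{ji}^*$), this can be
written down as

\[
\langle\alpha(t)|\hat{T}|\alpha(t)\rangle = |a_1|^2 T_{11} + |a_2|^2 T_{22}
+ 2|T_{12}|\,|a_1|\,|a_2| \cos\left(\frac{E_2 - E_1}{\hbar}t
+ \delta_{T_{21}} + \delta_{a_1} - \delta_{a_2}\right)
\tag{4.14}
\]

where the $\delta$s are phases of elements appearing in the corresponding subindexes.
This expression is explicitly real and is periodic with frequency determined by
the difference of energies, $(E_2 - E_1)/\hbar$. I will leave it to you as an exercise
to demonstrate that this result wouldn’t change if you factor out
$\exp\left(-i\frac{E_2}{\hbar}t\right)$ instead of $\exp\left(-i\frac{E_1}{\hbar}t\right)$.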
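
A quick numerical experiment makes the role of the energy difference tangible. The
sketch below is an illustration only: the energies, the coefficients, and the matrix
elements of the operator are hypothetical numbers, and the cosine form is written under
the assumption that the phase entering it is that of the element $T_{21}$. It compares
the directly computed expectation value with the cosine form, and checks that adding
the same constant to both energies changes nothing.

```python
import cmath
import math

HBAR = 1.0
E1, E2 = 0.7, 2.2                   # hypothetical energy eigenvalues
a1 = 0.6 + 0.0j                     # hypothetical expansion coefficients
a2 = 0.8 * cmath.exp(0.4j)
T = [[1.5, 0.3 - 0.9j],             # hypothetical Hermitian matrix T_ij
     [0.3 + 0.9j, -0.5]]


def expectation(t, shift=0.0):
    """<alpha(t)|T|alpha(t)> computed directly from the evolved coefficients."""
    c = [a1 * cmath.exp(-1j * (E1 + shift) * t / HBAR),
         a2 * cmath.exp(-1j * (E2 + shift) * t / HBAR)]
    total = sum(c[i].conjugate() * T[i][j] * c[j]
                for i in range(2) for j in range(2))
    return total.real


def cosine_form(t):
    """Cosine form of the same quantity; the phase of T_21 enters the cosine."""
    omega = (E2 - E1) / HBAR
    phase = cmath.phase(T[1][0]) + cmath.phase(a1) - cmath.phase(a2)
    return (abs(a1) ** 2 * T[0][0] + abs(a2) ** 2 * T[1][1]
            + 2 * abs(T[0][1]) * abs(a1) * abs(a2) * math.cos(omega * t + phase))


times = [0.0, 0.37, 1.9, 4.2]
period = 2 * math.pi * HBAR / (E2 - E1)
for t in times:
    # direct value, cosine form, and direct value with both energies shifted
    print(t, expectation(t), cosine_form(t), expectation(t, shift=5.0))
```

Setting `a2` to zero in this script reduces it to a single stationary state, for which
the expectation value does not depend on time at all.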


In a general case, expanding an arbitrary initial state vector in the basis of
the eigenvectors of the Hamiltonian, you can see that the time dependence of the
vector representing the state of the system is obtained by adding the corresponding
exponential factors $\exp\left(-i\frac{E_n}{\hbar}t\right)$ in front of each
$|\chi_n\rangle$ term in this expansion:

\[
|\alpha(t)\rangle = \sum_n a_n \exp\left(-i\frac{E_n}{\hbar}t\right)|\chi_n\rangle.
\tag{4.15}
\]

Expansion coefficients $a_n$ are determined by the initial state $|\alpha_0\rangle$ with
the help of Eq. 2.24. Equation 4.15 essentially solves the problem of quantum dynamics
provided one knows eigenvalues and eigenvectors of the system’s Hamiltonian. For
this reason solving the time-independent Schrödinger equation is one of the main
technical problems in quantum theory. Accordingly, much of this text, as well as of
all other books on quantum mechanics, will be devoted to devising various ways of
doing so.

If the Hamiltonian has a continuous spectrum of energy, the same idea for
generating the time-dependent state from an initial state still works. One only needs
to replace the sum over the discrete index in Eq. 4.15 by an integral over a relevant
continuous quantity $k$ labeling states of the system to get

\[
|\alpha(t)\rangle = \int a(k) \exp\left(-i\frac{E(k)}{\hbar}t\right)|\chi_k\rangle\, dk.
\tag{4.16}
\]

Coefficients $a(k)$ are again determined by an initial state in exactly the same way
as in the discrete case (you will be well advised to remember though that the
operational definitions of the inner product can be very different in the discrete and
continuous cases).


4.1.3 Ehrenfest Theorem and Correspondence Principle 

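I want to finish the discussion of the Schrödinger picture by deriving the so-called
Ehrenfest theorem, which is concerned with the dynamics of the expectation
value of a generic Hermitian operator $\hat{A}(t)$, which might have its own explicit
time dependence. Assuming that the system is in state $|\alpha(t)\rangle$, I will derive
a differential equation for quantity
$\langle\hat{A}(t)\rangle = \langle\alpha(t)|\hat{A}|\alpha(t)\rangle$, where
$\langle\hat{A}(t)\rangle$ is a frequently used shortened notation for the expectation
values. This expression can be differentiated using standard rules for differentiation
of a product of several functions:

\[
\begin{aligned}
\frac{d\langle\hat{A}\rangle}{dt}
&= \frac{d\langle\alpha(t)|}{dt}\hat{A}|\alpha(t)\rangle
+ \langle\alpha(t)|\hat{A}\frac{d|\alpha(t)\rangle}{dt}
+ \langle\alpha(t)|\frac{\partial\hat{A}}{\partial t}|\alpha(t)\rangle \\
&= \frac{i}{\hbar}\langle\alpha(t)|\hat{H}\hat{A}|\alpha(t)\rangle
- \frac{i}{\hbar}\langle\alpha(t)|\hat{A}\hat{H}|\alpha(t)\rangle
+ \langle\alpha(t)|\frac{\partial\hat{A}}{\partial t}|\alpha(t)\rangle \\
&= \frac{i}{\hbar}\left\langle[\hat{H}, \hat{A}]\right\rangle
+ \left\langle\frac{\partial\hat{A}}{\partial t}\right\rangle.
\end{aligned}
\tag{4.17}
\]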


In the Schrödinger picture, the operators are devoid of their own dynamics. The time
derivative in the last term of the Ehrenfest theorem takes into account a possibility
of an external time dependence of an operator, which is not related to its internal
dynamics. This explicit time dependence is a reflection of the changing environment
of the system, such as a time-dependent electromagnetic field interacting with an
atom. If operator $\hat{A}$ does not have such an externally imposed time dependence,
then the last term in Eq. 4.17 vanishes, and the dynamics of the expectation value of
the observable represented by $\hat{A}$ is completely determined by its commutator with
the Hamiltonian.

There is a special class of observables, whose operators commute with the
Hamiltonian. You already know that such observables are compatible with the
Hamiltonian, i.e., they would have a definite value if the system is in one of its
stationary states. The Ehrenfest theorem shows that such observables have an additional
property: regardless of the state of the system, their expectation values do not depend
on time. In other words, the expectation values of observables whose operators commute
with the Hamiltonian are conserved quantities.
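
Both statements, the Ehrenfest theorem for an operator without explicit time dependence
and the conservation of observables commuting with the Hamiltonian, can be checked on a
toy model. The sketch below is an assumption-laden illustration: it uses a hypothetical
two-level Hamiltonian proportional to $\sigma_x$, a hypothetical initial state, and
compares a finite-difference estimate of the time derivative of an expectation value
with $(i/\hbar)$ times the expectation value of the commutator.

```python
import math

HBAR = 1.0
OMEGA = 0.9                               # hypothetical energy scale
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Z = [[1, 0], [0, -1]]
H = [[0, OMEGA], [OMEGA, 0]]              # assumed Hamiltonian: OMEGA * sigma_x
PSI0 = [0.6, 0.8]                         # normalized hypothetical initial state


def state(t):
    """|alpha(t)> = exp(-i H t / hbar)|alpha(0)>, in closed form for this H."""
    c, s = math.cos(OMEGA * t / HBAR), math.sin(OMEGA * t / HBAR)
    return [c * PSI0[0] - 1j * s * PSI0[1], -1j * s * PSI0[0] + c * PSI0[1]]


def expect(op, psi):
    """Expectation value <psi|op|psi> for a 2x2 operator."""
    return sum(complex(psi[i]).conjugate() * op[i][j] * psi[j]
               for i in range(2) for j in range(2))


def commutator(a, b):
    """Commutator [a, b] of two 2x2 matrices."""
    ab = [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    ba = [[sum(b[i][k] * a[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


def lhs(op, t, h=1e-5):
    """d<op>/dt estimated by a central finite difference."""
    return (expect(op, state(t + h)).real - expect(op, state(t - h)).real) / (2 * h)


def rhs(op, t):
    """(i/hbar) <[H, op]> for an operator without explicit time dependence."""
    return (1j / HBAR * expect(commutator(H, op), state(t))).real


t = 0.8
print(lhs(SIGMA_Z, t), rhs(SIGMA_Z, t))   # two sides of the Ehrenfest theorem
print(lhs(SIGMA_X, t), rhs(SIGMA_X, t))   # sigma_x commutes with H
```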

Finally, I would like you to note a remarkable similarity between the Ehrenfest
theorem and Eq. 3.8 expressing the time derivative of a classical function of coordinate
and momentum in terms of its Poisson bracket with the classical Hamiltonian: the
two equations become identical if one makes a substitution
$\{\cdots\} \to -(i/\hbar)[\cdots]$. This similarity is physically significant as
illustrated by the following example. Let me apply the Ehrenfest theorem to a very
special but extremely important case of coordinate and momentum operators of a single
particle described by a time-independent Hamiltonian, like the one given in Eq. 3.48.
For simplicity I will limit myself to a one-dimensional case, so that I will only need
to consider one component of the position and momentum operators and can treat the
potential energy as a function of a single coordinate only. The Ehrenfest theorem
involves commutators of the respective operators with the Hamiltonian. In the case
under consideration, I have to compute $[\hat{x}, \hat{H}]$ and $[\hat{p}_x, \hat{H}]$
for $\hat{H}$ given by the one-dimensional version of Eq. 3.48, which I reproduce below
for your convenience:

\[
\hat{H} = \frac{\hat{p}_x^2}{2m} + V(\hat{x}).
\]

The easiest commutator to compute is $[\hat{x}, \hat{H}]$:

\[
[\hat{x}, \hat{H}] = \frac{1}{2m}[\hat{x}, \hat{p}_x^2] = \frac{i\hbar}{m}\hat{p}_x,
\tag{4.18}
\]


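where I used the fact that $\hat{x}$ commutes with $V(\hat{x})$, as well as
identity 3.24 and the canonical commutation relation, Eq. 3.47. It takes a bit
more labor to compute $[\hat{p}_x,\hat{H}] = [\hat{p}_x,V(\hat{x})]$. In order
to evaluate this commutator, I first present the potential energy as a power
series:

$$V(\hat{x}) = \sum_{n=0}^{\infty}\frac{1}{n!}\frac{d^nV}{dx^n}\hat{x}^n$$

(with all derivatives evaluated at $x = 0$),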

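so that I can write

$$[\hat{p}_x,V(\hat{x})] = \sum_{n=0}^{\infty}\frac{1}{n!}\frac{d^nV}{dx^n}\left[\hat{p}_x,\hat{x}^n\right].$$

Again using identity 3.24 to evaluate the commutator $[\hat{p}_x,\hat{x}^n]$, I get

$$\left[\hat{p}_x,\hat{x}^n\right] = -i\hbar n\hat{x}^{n-1},$$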


substitution of which into the previous equation yields

$$[\hat{p}_x,V(\hat{x})] = -i\hbar\sum_{n=1}^{\infty}\frac{1}{(n-1)!}\frac{d^nV}{dx^n}\hat{x}^{n-1} = -i\hbar\sum_{n=0}^{\infty}\frac{1}{n!}\frac{d^{n+1}V}{dx^{n+1}}\hat{x}^{n}.$$

Here I took into account that the $n = 0$ term of the initial series is a
constant, whose commutator with $\hat{p}_x$ vanishes. Correspondingly, the
summation in the middle expression above begins with $n = 1$. In the next step,
I changed the dummy index of summation, $n - 1 \to n$, turning the $n = 1$ term
into $n = 0$, $n = 2$ into $n = 1$, and so on. This change replaces the $n$th
derivative with the $(n+1)$th and $\hat{x}^{n-1}$ with $\hat{x}^{n}$. All that
is now left is to recognize that the resulting power series is the expansion of
the derivative of the function $V(x)$:

$$\frac{dV}{dx} = \sum_{n=0}^{\infty}\frac{1}{n!}\frac{d^{n+1}V}{dx^{n+1}}x^{n}.$$

Thus, I can proudly present

$$[\hat{p}_x,V(\hat{x})] = -i\hbar\frac{dV}{dx}. \tag{4.19}$$

Now, the Ehrenfest theorem for these two operators becomes

$$\frac{d\langle\hat{x}\rangle}{dt} = \frac{\langle\hat{p}_x\rangle}{m},$$

$$\frac{d\langle\hat{p}_x\rangle}{dt} = -\left\langle\frac{dV}{dx}\right\rangle.$$

Repeating the same calculations for all three components of the position and the 
momentum vectors, you can easily obtain the three-dimensional version of these 
equations: 


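$$\frac{d\langle\hat{\mathbf{r}}\rangle}{dt} = \frac{\langle\hat{\mathbf{p}}\rangle}{m}, \tag{4.20}$$

$$\frac{d\langle\hat{\mathbf{p}}\rangle}{dt} = -\langle\nabla V\rangle, \tag{4.21}$$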


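where $\nabla V$ (in case you forgot) is the gradient of $V$, defined in
Cartesian coordinates with unit vectors $\mathbf{e}_x$, $\mathbf{e}_y$, and
$\mathbf{e}_z$ in the directions of the corresponding axes $X$, $Y$, and $Z$ as

$$\nabla V = \mathbf{e}_x\frac{\partial V}{\partial x} + \mathbf{e}_y\frac{\partial V}{\partial y} + \mathbf{e}_z\frac{\partial V}{\partial z}.$$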

The obtained equations resemble classical Hamiltonian equations, but it would
actually be wrong to say (as many textbooks do) that the Ehrenfest equations
make expectation values of the position and momentum operators behave like the
corresponding classical quantities. In reality these equations do not even
constitute a closed system of equations, which becomes almost obvious once you
realize that, generally speaking, $\langle dV/dx\rangle$ is not the same as
$dV/dx$ evaluated at $x = \langle\hat{x}\rangle$. The two coincide only if the
potential energy is either a linear or a quadratic function of the coordinate.
In the former case $-dV/dx = F = \mathrm{const}$, so that the Ehrenfest
equations have a simple solution:

$$\langle\hat{p}_x\rangle = p_0 + Ft,$$

$$\langle\hat{x}\rangle = x_0 + (p_0/m)t + \frac{1}{2}(F/m)t^2,$$

reproducing the classical equations for a particle moving with constant
acceleration. In the case of a quadratic potential (harmonic oscillator)

$$-\frac{dV}{dx} = -kx,$$

so that

$$\left\langle\frac{dV}{dx}\right\rangle = k\langle\hat{x}\rangle,$$

reducing the Ehrenfest equations to the classical equations describing a
harmonic oscillator. To illustrate the difficulty arising in a more general
situation, consider $V(x) = ax^3/3$. In this case the Ehrenfest equations become


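$$\frac{d\langle\hat{x}\rangle}{dt} = \frac{\langle\hat{p}_x\rangle}{m},$$

$$\frac{d\langle\hat{p}_x\rangle}{dt} = -a\langle\hat{x}^2\rangle.$$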


Since $\langle\hat{x}^2\rangle \neq \langle\hat{x}\rangle^2$, the resulting
system of equations is not closed, because now you need to derive a separate
equation for $\langle\hat{x}^2\rangle$. Trying to do so (see the exercises) will
feel like a recurring nightmare: you will end up with new variables at each
step, and this process will never end. You might wonder, of course, whether it
is possible to find a state for which $\langle\hat{x}^n\rangle =
\langle\hat{x}\rangle^n$, in which case the Ehrenfest equations would literally
coincide with the Hamiltonian equations. It is not very difficult to prove that
the only state in which this might be true is the state represented by an
eigenvector of the coordinate operator. Unfortunately, even if at some time
$t = 0$ you can create a system in such a state, it will lose this property as
it evolves in time. To see that this is indeed the case, imagine a state
$|x(t)\rangle$ such that $\hat{x}|x(t)\rangle = x(t)|x(t)\rangle$ and try to
plug it into Eq. 4.9 describing the dynamics of quantum states. You will
immediately see that since the coordinate and momentum operators do not
commute, this state cannot be a solution of the time-dependent Schrödinger
equation.
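
If you would like to see the inequality $\langle\hat{x}^2\rangle \neq
\langle\hat{x}\rangle^2$ in numbers, here is a minimal sketch. It assumes (purely
for illustration) a Gaussian probability density centered at $x_0 = 1$ with
width $\sigma = 0.5$ and the cubic potential with $a = 1$; the function name
`moments` is my own invention. The gap between the two averages is just the
variance $\sigma^2$, so it disappears only for a state with no spread at all.

```python
import math

def moments(x0, sigma, n_points=4001, half_width=10.0):
    """Return (<x>, <x^2>) for a Gaussian probability density |psi(x)|^2."""
    lo = x0 - half_width * sigma
    dx = 2 * half_width * sigma / (n_points - 1)
    norm = mean = mean_sq = 0.0
    for k in range(n_points):
        x = lo + k * dx
        rho = math.exp(-((x - x0) ** 2) / (2 * sigma ** 2))  # unnormalized density
        norm += rho * dx
        mean += x * rho * dx
        mean_sq += x * x * rho * dx
    return mean / norm, mean_sq / norm

mean, mean_sq = moments(x0=1.0, sigma=0.5)
gap = mean_sq - mean ** 2      # variance of the packet, sigma^2
force_exact = -mean_sq         # <F> = -a <x^2> for V = a x^3 / 3 with a = 1
force_naive = -mean ** 2       # F evaluated at <x>, the "classical" guess
print(mean, mean_sq, gap, force_exact, force_naive)
```

The average force entering the Ehrenfest equation and the force evaluated at the
average position differ by exactly this variance (times $a$).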

There is, however, another way to make the Ehrenfest equations identical to
their classical Hamiltonian counterparts. All you need to do is to neglect the
quantum uncertainty of coordinate and momentum. Since technically these
uncertainties arise from the canonical commutation relation, you can do away
with them by passing to the limit $\hbar \to 0$. The emergence of Hamiltonian
equations in this so-called classical limit is a very attractive and soothing
feature of the quantum formalism, indicating that the developed theory adheres
to the correspondence principle formulated (again!) by Niels Bohr. This
principle played an important heuristic and philosophical role in the
development of quantum theory. It states that the quantum theory must reproduce
the results of classical physics in situations where classical physics is known
to be valid. Even though the concrete mathematical expressions defining
situations when the quantum description must reduce to the classical one vary
from phenomenon to phenomenon, they all involve taking the limit $\hbar \to 0$.
In this limit, for instance, the quantum of energy $\hbar\omega$ introduced by
Planck vanishes, the de Broglie wavelength $\lambda = h/p$ goes to zero, and the
quantum uncertainties of various observables, which prevented you from
replacing $\langle\hat{x}^n\rangle$ with $\langle\hat{x}\rangle^n$, disappear.

4.2 Heisenberg Picture

As I mentioned in the beginning of this chapter, quantum dynamics can be
described by imposing a time dependence on operators rather than on the states.
This approach, a version of which was designed by Heisenberg, Born, and Jordan,
was historically first, is directly connected with the classical Hamiltonian
equations, and is quite popular in current research literature on quantum
mechanics, especially in quantum optics. However, for some reason it rarely
appears in undergraduate texts on quantum theory. Probably, it is believed that
the idea of time-dependent operators is too complicated for the infirm minds of
undergraduate physics majors to comprehend, but I personally do not see why
this must be the case. So, let's try to remove the veil of mystery from this
alternative version of quantum theory, called the Heisenberg picture.

At first glance, the idea of time-dependent operators seems indeed quite strange: 
if an operator is, e.g., a prescription to differentiate a function, how can this rule 
change with time? The best way to answer this type of question is to first develop 
a formal way to describe the time dependence of operators and, then, to illustrate it 
using a few simple examples. 

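I begin by considering the expectation value of some arbitrary operator
$\hat{A}$ in the state $|\alpha(t)\rangle$:
$\langle\alpha(t)|\hat{A}|\alpha(t)\rangle$. Using the time-evolution operator
defined in Eq. 4.2, I can present this expectation value as

$$\langle\hat{A}(t)\rangle = \langle\alpha_0|\hat{U}^\dagger(t,0)\hat{A}(t)\hat{U}(t,0)|\alpha_0\rangle. \tag{4.22}$$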


Lumping together all three operators appearing between the bra and ket vectors
in Eq. 4.22 into a new operator

$$\hat{A}_H(t) = \hat{U}^\dagger(t,0)\hat{A}(t)\hat{U}(t,0) \tag{4.23}$$

yields a time-dependent Heisenberg representation of the initial operator. This
time dependence has, in general, two sources: an external time dependence of
the initial Schrödinger operator discussed in the previous section and an
internal time dependence responsible for the quantum dynamics of the system,
represented by the time-evolution operators. Differentiating this equation with
respect to time, I obtain

$$\frac{d\hat{A}_H(t)}{dt} = \frac{\partial\hat{U}^\dagger}{\partial t}\hat{A}\hat{U} + \hat{U}^\dagger\hat{A}\frac{\partial\hat{U}}{\partial t} + \hat{U}^\dagger\frac{\partial\hat{A}}{\partial t}\hat{U}.$$

Using Eq. 4.8 this can be rewritten as

$$\frac{d\hat{A}_H(t)}{dt} = \frac{i}{\hbar}\hat{U}^\dagger(t,0)\hat{H}\hat{A}(t)\hat{U}(t,0) - \frac{i}{\hbar}\hat{U}^\dagger(t,0)\hat{A}(t)\hat{H}\hat{U}(t,0) + \hat{U}^\dagger(t,0)\frac{\partial\hat{A}(t)}{\partial t}\hat{U}(t,0).$$

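Taking advantage of the unitarity of the time-evolution operator, I insert the
combination $\hat{U}\hat{U}^\dagger = \hat{I}$ between the Hamiltonian and the
operator $\hat{A}$ in the first two terms of the equation above. This procedure
yields

$$\frac{d\hat{A}_H(t)}{dt} = \frac{i}{\hbar}\underbrace{\hat{U}^\dagger\hat{H}\hat{U}}_{\hat{H}_H}\,\underbrace{\hat{U}^\dagger\hat{A}(t)\hat{U}}_{\hat{A}_H} - \frac{i}{\hbar}\underbrace{\hat{U}^\dagger\hat{A}(t)\hat{U}}_{\hat{A}_H}\,\underbrace{\hat{U}^\dagger\hat{H}\hat{U}}_{\hat{H}_H} + \underbrace{\hat{U}^\dagger\frac{\partial\hat{A}(t)}{\partial t}\hat{U}}_{(\partial\hat{A}/\partial t)_H},$$

where each of the bracketed terms defines, according to Eq. 4.23, a Heisenberg
representation of the corresponding operator. Therefore, the last equation can
be rewritten as

$$\frac{d\hat{A}_H}{dt} = \frac{i}{\hbar}\left[\hat{H}_H,\hat{A}_H\right] + \left(\frac{\partial\hat{A}}{\partial t}\right)_H. \tag{4.24}$$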
The resulting equation is called the Heisenberg equation for time-dependent
operators. It looks very much like the Ehrenfest theorem, and just like the
latter, it resembles the classical Eq. 3.8. However, unlike the Ehrenfest
theorem, the Heisenberg equation describes the time evolution of operators
rather than of expectation values and, therefore, does not suffer from the
perpetual emergence of new variables. The Ehrenfest theorem can be obtained
from Eq. 4.24 by computing the expectation values of both sides of this equation
with the initial state $|\alpha_0\rangle$. The initial condition for the
Heisenberg equation can be easily ascertained from Eq. 4.23: setting $t = 0$ in
this equation immediately shows that the initial conditions for Heisenberg
operators are given by the corresponding Schrödinger operators, establishing an
intimate connection between the two pictures. Now you can answer the question
posed in the beginning of this section: how can a rule representing an operator
change with time? The time dependence comes from combinations of various
immutable rules with time-dependent coefficients. You will see an example of
this a few paragraphs below.

The Hamiltonian appearing in Eq. 4.24 is the Heisenberg representation of the
regular Schrödinger Hamiltonian and must be evaluated before the Heisenberg
equations can be used. However, in the special and important case of a
time-independent Schrödinger Hamiltonian, one can easily show that
$\hat{H}_H = \hat{H}$. Indeed, one can infer from Eq. 4.10 that the
time-evolution operator for a time-independent Hamiltonian is

$$\hat{U}(t,0) = \exp\left(-i\hat{H}t/\hbar\right). \tag{4.25}$$

This operator obviously commutes with the Hamiltonian, and as a result we have

$$\hat{H}_H = \hat{U}^\dagger\hat{H}\hat{U} = \hat{U}^\dagger\hat{U}\hat{H} = \hat{H}.$$


The same arguments apply to any Schrödinger operator that commutes with the
Hamiltonian and has no explicit time dependence, so all such operators remain
independent of time. This result also obviously follows from Eq. 4.24. Thus,
one can say that operators commuting with the Hamiltonian represent conserved
quantum observables not only at the level of expectation values, as in the
Ehrenfest theorem, but at the deeper level of the operators themselves.
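
The statements above are easy to spot-check on a two-level system. The sketch
below is only an illustration under my own assumptions: units with $\hbar = 1$,
an arbitrarily chosen Hermitian 2 x 2 matrix standing in for the Hamiltonian,
and hand-written helpers (`matmul`, `dagger`, `expm`, `heisenberg`) that are
not part of any library. It builds $\hat{U} = \exp(-i\hat{H}t/\hbar)$ from the
power series of the exponential and compares Heisenberg operators with their
Schrödinger originals.

```python
HBAR = 1.0  # assumption: units in which hbar = 1

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def expm(a, terms=40):
    """Matrix exponential summed from its power series (fine for small 2x2 input)."""
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, a)]  # a^n / n!
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def heisenberg(op, hamiltonian, t):
    """Return U^dagger op U with U = exp(-i H t / hbar)."""
    u = expm([[-1j * t / HBAR * h for h in row] for row in hamiltonian])
    return matmul(dagger(u), matmul(op, u))

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

H = [[1.0 + 0j, 0.3 - 0.2j], [0.3 + 0.2j, -0.5 + 0j]]  # arbitrary Hermitian matrix
H2 = matmul(H, H)                                       # commutes with H
B = [[0j, 1 + 0j], [1 + 0j, 0j]]                        # does not commute with H

t = 2.0
drift_H = max_diff(heisenberg(H, H, t), H)     # expected: zero up to round-off
drift_H2 = max_diff(heisenberg(H2, H, t), H2)  # expected: zero up to round-off
drift_B = max_diff(heisenberg(B, H, t), B)     # expected: clearly nonzero
print(drift_H, drift_H2, drift_B)
```

Operators commuting with the Hamiltonian come back unchanged, while a generic
operator acquires a genuine time dependence.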

In the case of Hamiltonians with explicit time dependence, you can no longer
claim that the Heisenberg representation of the Hamiltonian coincides with the
Schrödinger one. The Heisenberg picture in this case loses its immediate
appeal, and people often prefer a picture intermediate between Schrödinger's
and Heisenberg's, called the interaction representation. In this representation
the Hamiltonian is divided into time-independent and time-dependent parts:

$$\hat{H}(t) = \hat{H}_0 + \hat{V}(t).$$

The transition to new operators is now carried out using the time-evolution
operator $\hat{U}(t,0) = \exp(-i\hat{H}_0t/\hbar)$. As a result one ends up
with both the operators and the state of the system depending on time: the
dynamics of the operators is defined by the operator $\hat{H}_0$ and the
dynamics of the states by $\hat{V}(t)$. However, something tells me that
continuing with this line of thought would bring me way over the line allowed
in an undergraduate course. So, consider this a teaser and a preview of things
to come if you decide to deepen your knowledge of quantum theory.

Now back to the time-independent Hamiltonians. The same calculations as in 
the case of Ehrenfest equations yield the following Heisenberg equations for the 
one-dimensional motion of a quantum particle: 


$$\frac{d\hat{x}}{dt} = \frac{\hat{p}_x}{m}, \tag{4.26}$$

$$\frac{d\hat{p}_x}{dt} = -\frac{dV}{dx}, \tag{4.27}$$

which coincide with the respective Hamiltonian equations of classical
mechanics. Again, just like in the case of the Ehrenfest theorem, these
equations can be easily generalized to the three-dimensional case:

$$\frac{d\hat{\mathbf{r}}}{dt} = \frac{\hat{\mathbf{p}}}{m}, \tag{4.28}$$

$$\frac{d\hat{\mathbf{p}}}{dt} = -\nabla V. \tag{4.29}$$
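
The simplest instance of Eqs. 4.26 and 4.27 is worth writing out as a warm-up.
For a free particle ($V = 0$, an assumption made only for this illustration)
the momentum equation gives a constant operator, and the coordinate equation
integrates at once:

$$\hat{p}_x(t) = \hat{p}_{x0}, \qquad \hat{x}(t) = \hat{x}_0 + \frac{\hat{p}_{x0}}{m}t,$$

where $\hat{x}_0$ and $\hat{p}_{x0}$ denote the Schrödinger operators, which
serve as initial conditions. The Heisenberg coordinate is thus a combination of
fixed Schrödinger operators with a time-dependent coefficient. Note that
coordinates taken at different times no longer commute:

$$[\hat{x}(t),\hat{x}(0)] = \frac{t}{m}[\hat{p}_{x0},\hat{x}_0] = -\frac{i\hbar t}{m}.$$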


To illustrate how the Heisenberg equations work, let me consider the case of a
one-dimensional harmonic oscillator, a particle moving in a quadratic potential
of the form $V = \frac{1}{2}m\omega^2x^2$, in which case Eq. 4.27 becomes

$$\frac{d\hat{p}_x}{dt} = -m\omega^2\hat{x}. \tag{4.30}$$


Differentiation of this equation with respect to time yields a differential
equation of the second order:

$$\frac{d^2\hat{p}_x}{dt^2} = -\omega^2\hat{p}_x,$$

where the term $d\hat{x}/dt$ is replaced with $\hat{p}_x/m$ with the help of
Eq. 4.26. It is straightforward to verify that the equation for the momentum
operator is solved by

$$\hat{p}(t) = \hat{\pi}_1\cos\omega t + \hat{\pi}_2\sin\omega t, \tag{4.31}$$

while an expression for the coordinate operator is obtained from Eq. 4.30 by
simple differentiation:

$$\hat{x}(t) = \frac{\hat{\pi}_1}{m\omega}\sin\omega t - \frac{\hat{\pi}_2}{m\omega}\cos\omega t. \tag{4.32}$$

The unknown operators $\hat{\pi}_{1,2}$ in Eqs. 4.31 and 4.32 are to be
determined from the initial conditions. (The general solution of any linear
differential equation is a combination of particular solutions with
undetermined constant coefficients. Since we are dealing with operator
equations, these unknown coefficients must also be operators.) Substituting
$t = 0$ in the found solutions, you can see that

$$\hat{\pi}_1 = \hat{p}_{x0}, \qquad \hat{\pi}_2 = -m\omega\hat{x}_0,$$

so that, just as I advertised, the time-dependent momentum and coordinate
operators are expressed as linear combinations of the Schrödinger operators
$\hat{p}_{x0}$ and $\hat{x}_0$ with time-dependent coefficients:

$$\hat{p}(t) = \hat{p}_{x0}\cos\omega t - m\omega\hat{x}_0\sin\omega t,$$

$$\hat{x}(t) = \frac{\hat{p}_{x0}}{m\omega}\sin\omega t + \hat{x}_0\cos\omega t. \tag{4.33}$$

The obtained solution for the Heisenberg operators looks identical to the
solution of the classical harmonic oscillator problem, but do not get deceived
by this similarity. For instance, in the classical case one can envision
initial conditions such that either $x_0$ or $p_{x0}$ (not both, of course) is
zero. In the quantum case these are operators and cannot be set to zero. At
this point you do not know enough about the properties of the quantum harmonic
oscillator to analyze this result any further, so I will postpone doing this
till later. Still, we can have a bit of fun and, as an exercise, calculate the
commutator between the operators $\hat{p}(t)$ and $\hat{x}(t)$, taken not
necessarily at the same time.

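Example 15 (Commutator of Heisenberg Operators for Harmonic Oscillator)

$$\begin{aligned}
[\hat{x}(t_1),\hat{p}(t_2)] &= \sin\omega t_1\cos\omega t_2\left[\frac{\hat{p}_{x0}}{m\omega},\hat{p}_{x0}\right] + \cos\omega t_1\cos\omega t_2\left[\hat{x}_0,\hat{p}_{x0}\right] \\
&\quad - \sin\omega t_1\sin\omega t_2\left[\hat{p}_{x0},\hat{x}_0\right] - \sin\omega t_2\cos\omega t_1\left[\hat{x}_0,m\omega\hat{x}_0\right] \\
&= i\hbar\cos\omega t_1\cos\omega t_2 + i\hbar\sin\omega t_1\sin\omega t_2 = i\hbar\cos\left[\omega(t_1 - t_2)\right].
\end{aligned}$$

It is interesting to note that this commutator depends only on the time
interval $t_1 - t_2$ and not on $t_1$ and $t_2$ separately. The equal-time
commutator ($t_1 = t_2$) coincides with the canonical commutator for the
Schrödinger coordinate and momentum operators.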
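
The bookkeeping in Example 15 can be checked with a few lines of code. Since
both Heisenberg operators in Eq. 4.33 are linear combinations of $\hat{x}_0$
and $\hat{p}_{x0}$, bilinearity of the commutator gives
$[a\hat{x}_0 + b\hat{p}_{x0}, c\hat{x}_0 + d\hat{p}_{x0}] = (ad - bc)[\hat{x}_0,\hat{p}_{x0}]$,
so only the numerical factor $ad - bc$ needs to be computed. The sketch below
does just that; the function names and the default values of $m$ and $\omega$
are arbitrary choices of mine.

```python
import math

def coefficients(t, m, omega):
    """Coefficients of x0 and p0 in x(t) and p(t) according to Eq. 4.33."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    x_coeffs = (c, s / (m * omega))   # x(t) = c * x0 + s / (m * omega) * p0
    p_coeffs = (-m * omega * s, c)    # p(t) = -m * omega * s * x0 + c * p0
    return x_coeffs, p_coeffs

def commutator_factor(t1, t2, m=1.3, omega=0.7):
    """Return C such that [x(t1), p(t2)] = C * [x0, p0] = C * i * hbar."""
    (a1, b1), _ = coefficients(t1, m, omega)
    _, (c2, d2) = coefficients(t2, m, omega)
    # [a1 x0 + b1 p0, c2 x0 + d2 p0] = (a1 d2 - b1 c2) [x0, p0]
    return a1 * d2 - b1 * c2

for t1, t2 in [(0.0, 0.0), (0.4, 1.9), (3.0, -2.5)]:
    print(t1, t2, commutator_factor(t1, t2), math.cos(0.7 * (t1 - t2)))
```

For every pair of times the factor agrees with $\cos[\omega(t_1 - t_2)]$, and
the mass drops out, as it should.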

It is also fun to think about eigenvectors and eigenvalues of the Heisenberg 
operators in Eq. 4.33, but I will let you play this game as an exercise. 


4.3 Problems 
Section 4.1.1 


Problem 45 Consider a Hamiltonian presented by a 2 x 2 matrix

$$\hat{H} = \hbar\omega\begin{pmatrix}\cos\theta & \sin\theta\exp(i\varphi)\\ \sin\theta\exp(-i\varphi) & -\cos\theta\end{pmatrix}.$$

1. Find the Hermitian conjugate and the inverse of the matrix and convince
yourself that the matrix is simultaneously Hermitian and unitary.

2. Using the representation of an exponential function as a power series,
evaluate the time-evolution operator for this Hamiltonian.

3. Assume that the initial state of the system is given by a vector


l«o) = 


1 


i 

i 


and find $|\alpha(t)\rangle$ using the time-evolution operator.


Section 4.1.2 


Problem 46 Prove that if $\hat{H}|\chi_n\rangle = E_n|\chi_n\rangle$, then $\hat{H}^m|\chi_n\rangle = E_n^m|\chi_n\rangle$.

Problem 47 Consider a system with Hamiltonian

$$\hat{H} = \frac{\hat{p}^2}{2m_e} + V(\mathbf{r}).$$

Assume that you know its eigenvectors $|\chi_n\rangle$ and eigenvalues $E_n$.
Show that if you change the potential in this Hamiltonian to
$V(\mathbf{r}) - E_0$, all the eigenvectors will stay the same, the eigenvalue
$E_0$ will become equal to zero, and all other eigenvalues will become
$E_n - E_0$.

Problem 48 Re-derive Eq. 4.14 factoring out exp instead of exp and 

demonstrate that this result does not change. 

Problem 49 Consider a system described by a Hamiltonian

$$\hat{H} = E_0\begin{pmatrix}1 & ia\\ -ia & 1\end{pmatrix}.$$

1. Find stationary states of this Hamiltonian. 

2. Assuming that at t = 0 the system is in the state 


|« 0 > = 


1 

7 ! 


i 

i 


find $|\alpha(t)\rangle$ using stationary states of the Hamiltonian.


Section 4.1.3 


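Problem 50 Prove that $\langle\alpha|\hat{x}^2|\alpha\rangle =
(\langle\alpha|\hat{x}|\alpha\rangle)^2$ if and only if
$\hat{x}|\alpha\rangle = x|\alpha\rangle$. It is not very difficult to prove
that $\hat{x}|\alpha\rangle = x|\alpha\rangle$ implies
$\langle\alpha|\hat{x}^2|\alpha\rangle = (\langle\alpha|\hat{x}|\alpha\rangle)^2$,
but the proof of the opposite statement requires a bit more ingenuity. You can
try to prove it by demonstrating that if $\hat{x}|\alpha\rangle \neq
x|\alpha\rangle$, then $\langle\alpha|\hat{x}^2|\alpha\rangle$ cannot be equal
to $(\langle\alpha|\hat{x}|\alpha\rangle)^2$.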

Problem 51 Go back to the problem involving a one-dimensional motion of a
particle in the cubic potential $V = ax^3/3$ discussed in Sect. 3.3.1. It has
been shown in the text that the Ehrenfest equation for
$\langle\hat{p}_x\rangle$ involves the expectation value
$\langle\hat{x}^2\rangle$. Derive the Ehrenfest equation for this quantity. Do
you see expectation values of any new operators or combinations of operators in
the equation for $\langle\hat{x}^2\rangle$? Derive the Ehrenfest equations for
those new quantities. Comment on the results.


Section 4.2 

Problem 52 You will learn in the following section that quantum states can be
described by functions of coordinates (wave functions), in which case the
Schrödinger momentum operator becomes

$$\hat{p}_{x0}\psi(x) = -i\hbar\frac{d\psi}{dx},$$

while the coordinate operator becomes simple multiplication by the coordinate,
$\hat{x}_0\psi(x) = x\psi(x)$. Using this form of the time-independent
operators, find the functions representing eigenvectors of their time-dependent
Heisenberg counterparts:

$$\hat{p}(t) = \hat{p}_{x0}\cos\omega t - m\omega\hat{x}_0\sin\omega t,$$

$$\hat{x}(t) = \frac{\hat{p}_{x0}}{m\omega}\sin\omega t + \hat{x}_0\cos\omega t.$$

Analyze the behavior of these eigenvectors as functions of time; especially,
consider the limits $\omega t = \pi n$ and $\omega t = \pi/2 + \pi n$, where
$n = 0, 1, 2, \ldots$. Hint: Time-dependent terms here are just parameters, and
their time dependence does not affect how you solve the respective differential
equations.

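Problem 53 Derive Heisenberg equations for the operators $\hat{a}$,
$\hat{a}^\dagger$ and $\hat{b}$, $\hat{b}^\dagger$ appearing in the following
Hamiltonian:

$$\hat{H} = \hbar\omega\hat{a}^\dagger\hat{a} + \hbar\Omega\hat{b}^\dagger\hat{b} + k\left(\hat{a}^\dagger\hat{b} + \hat{b}^\dagger\hat{a}\right).$$

Commutation relations for these operators are as follows:

$$[\hat{a},\hat{a}^\dagger] = 1;\quad [\hat{b},\hat{b}^\dagger] = 1;\quad [\hat{a},\hat{b}] = 0;\quad [\hat{a},\hat{b}^\dagger] = 0.$$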



Chapter 5 

Representations of Vectors and Operators 




5.1 Representation in Continuous Basis 

We have managed to get through four chapters of this text without specifying
any concrete form of the state vectors, treating them as abstractions defined
only by the rules of the games that we could play with them. This approach is
very convenient and rewarding from a theoretical point of view, as it
emphasizes the generality of the quantum approach to the world and allows us to
derive a number of important general results with relative ease. However, when
it comes to responding to experimentalists' requests to explain or predict
their quantitative experimental results, we do need to have something a bit
more concrete and tangible than the idea of an abstract vector. A similar
situation actually arises in the case of our regular three-dimensional
geometric vectors. It is often convenient to think of them as purely
geometrical objects (arrows, for instance) and derive results independent of
any choice of coordinate system. However, at some point you will need to get to
some "down-to-earth" computations, and to carry them out, you will have to
choose a coordinate system and replace the "arrows" with a set of numbers, the
vector components.

In the case of abstract vectors that live in an abstract linear vector space,
you can use the same idea to get a more concrete and handy representation of
the quantum states. All these representations require that we use a basis in
our abstract space. It seems more logical to begin with representations based
on discrete bases, but in reality, we are somewhat forced to start with
continuous bases. The reason for this is that the two main observables in
quantum mechanics, from which almost all others can be constructed, are the
position and the momentum (see Sect. 3.3.2). Operators corresponding to these
observables have continuous spectra, and, therefore, you will have to learn how
to represent these operators using continuous bases.

5.1.1 Position and Momentum Operators in a Continuous Basis

Let me begin with some abstract operator $\hat{Q}$ and a continuous basis
formed by orthonormalized vectors $|q\rangle$:

$$\langle q|q'\rangle = \delta(q - q'),$$

where $q$ is a continuously changing parameter. You already know from Sect. 2.3
that an abstract ket vector $|\alpha\rangle$ can be presented as an integral:

$$|\alpha\rangle = \int dq\,\varphi_\alpha(q)|q\rangle. \tag{5.1}$$

Hermitian conjugation of this expression produces its bra counterpart:

$$\langle\alpha| = \int dq\,\varphi_\alpha^*(q)\langle q|. \tag{5.2}$$

I can now write down the inner product between vectors $|\alpha\rangle$ and
$|\beta\rangle$ as

$$\langle\alpha|\beta\rangle = \int dq\int dq'\,\varphi_\alpha^*(q)\varphi_\beta(q')\langle q|q'\rangle = \int dq\int dq'\,\varphi_\alpha^*(q)\varphi_\beta(q')\delta(q - q') = \int dq\,\varphi_\alpha^*(q)\varphi_\beta(q). \tag{5.3}$$

The subindexes $\alpha$ and $\beta$ in these expressions indicate the
correspondence between an abstract vector and the respective function appearing
in the superposition given by Eq. 5.1. Applying Eq. 5.3 to the case when
$|\alpha\rangle = |\beta\rangle$, I reproduce Eq. 2.42, and by recalling that
all state vectors must be normalized, I end up with the condition

$$\int dq\,\varphi_\alpha^*(q)\varphi_\alpha(q) = 1, \tag{5.4}$$

generalizing Eq. 2.43, which was originally derived only for the functions of
coordinates. It should be noted here that while I am using a single variable
$q$ as an argument of the functions $\varphi(q)$, you must understand that it
is just a convenient notation, and in reality, $q$ can represent several
variables. For instance, eigenvectors of the position operator depend on three
components of the position vector, but we have been using the single symbol
$\mathbf{r}$ to designate them all.

As long as we all agree on the choice of the basis, and do not change it in the
middle of a conversation (or calculation), we have a one-to-one correspondence
between abstract vectors and the respective superposition coefficients. The
function $\varphi_\alpha(q)$ provides a complete description of the
corresponding vector and can, therefore, be considered its faithful
representation. It can be expressed in terms of the vector $|\alpha\rangle$ and
the basis vectors by premultiplying Eq. 5.1 by $\langle q'|$ and using the
orthonormality condition:

$$\varphi_\alpha(q) = \langle q|\alpha\rangle. \tag{5.5}$$

This essentially completes the discussion of the representation of vectors, but
this was the easy part. You also need to learn how to find the representation
of operators appropriate for the developed representation of vectors, which is
the hard part. For starters, I need to explain to you what it means to
represent an operator. Consider an expression

$$|\beta\rangle = \hat{Q}|\alpha\rangle,$$

where two abstract vectors are related to each other by the abstract operator
$\hat{Q}$. It seems reasonable to define a representation of the operator as an
object that yields the same relation between the functions $\varphi_\alpha(q)$
and $\varphi_\beta(q)$ representing the corresponding abstract vectors
$|\alpha\rangle$ and $|\beta\rangle$. In order to figure it out, let me insert
the completeness condition for the continuous spectrum, Eq. 3.45, formed with
basis vectors $|q\rangle$, into three places in
$|\beta\rangle = \hat{Q}|\alpha\rangle$: in front of the vector
$|\beta\rangle$, in front of the operator $\hat{Q}$, and between the operator
and $|\alpha\rangle$:

$$\int dq\,|q\rangle\langle q|\beta\rangle = \int dq\int dq'\,|q\rangle\langle q|\hat{Q}|q'\rangle\langle q'|\alpha\rangle. \tag{5.6}$$

(This amounts to inserting unity operators in all three places, so, obviously,
I have not changed anything.) Using Eq. 5.5 I get

$$\int dq\,|q\rangle\varphi_\beta(q) = \int dq\int dq'\,|q\rangle\langle q|\hat{Q}|q'\rangle\varphi_\alpha(q'),$$

which can be rewritten as

$$\int dq\,|q\rangle\left[\varphi_\beta(q) - \int dq'\,\langle q|\hat{Q}|q'\rangle\varphi_\alpha(q')\right] = 0.$$

Since the vectors of the basis are linearly independent, the integral in this
expression can be zero only if the expression in the square brackets is zero,
which yields

$$\varphi_\beta(q) = \int dq'\,\langle q|\hat{Q}|q'\rangle\varphi_\alpha(q') = \int dq'\,Q(q,q')\varphi_\alpha(q'), \tag{5.7}$$


where I introduced $Q(q,q') = \langle q|\hat{Q}|q'\rangle$. Thus, the abstract
operator $\hat{Q}$ in the continuous basis takes the form of an integral
operator with kernel $\langle q|\hat{Q}|q'\rangle$. If we know how this
operator acts on the basis vectors, we can determine the kernel and replace the
abstract relation $|\beta\rangle = \hat{Q}|\alpha\rangle$ by the integral
relation given by Eq. 5.7. For instance, you can easily find the kernel if the
basis is formed by eigenvectors of the operator $\hat{Q}$. Indeed, if you know
that $\hat{Q}|q\rangle = q|q\rangle$, then
$Q(q,q') = \langle q|\hat{Q}|q'\rangle = q\delta(q - q')$ and Eq. 5.7
simplifies to

$$\varphi_\beta(q) = \int dq'\,Q(q,q')\varphi_\alpha(q') = q\varphi_\alpha(q), \tag{5.8}$$

i.e., the integral operator is reduced to simple multiplication by the
corresponding eigenvalue, and what can be simpler? Unfortunately, life is
always more complicated, and in most cases, you will have to deal
simultaneously with at least two non-commuting operators, so that eigenvectors
of one operator will not be the eigenvectors of the other. To deal with this
situation, you have to learn how to represent an operator in a basis formed by
vectors that are not its eigenvectors.

Most of the bases used for practical calculations are formed by eigenvectors of
some Hermitian operator, and recognition of this fact can be quite useful. So,
in addition to the original operator $\hat{Q}$ with eigenvectors $|q\rangle$
and eigenvalues $q$, let me introduce another operator $\hat{S}$ with
eigenvectors $|s\rangle$ and eigenvalues $s$. The goal now is to find an
integral kernel representing the operator $\hat{Q}$ in the basis of vectors
$|s\rangle$. In other words, I need to express the matrix elements
$\langle s|\hat{Q}|s'\rangle$ in terms of quantities defined in the basis
formed by the vectors $|q\rangle$. It can be done by exploiting (twice) the
same old trick with the completeness relation expressed in terms of these
vectors:

$$\langle s|\hat{Q}|s'\rangle = \int dq\,dq'\,\langle s|q\rangle\langle q|\hat{Q}|q'\rangle\langle q'|s'\rangle.$$

Now, taking into account that the kernel $Q(q,q')$ in the basis of its own
eigenvectors is $Q(q,q') = q\delta(q - q')$, I can simplify the above
expression into

$$Q(s,s') = \langle s|\hat{Q}|s'\rangle = \int dq\,q\,\langle s|q\rangle\langle q|s'\rangle, \tag{5.9}$$

which gives me exactly what I have been looking for. For this expression to be
useful, however, you would need to know the functions
$\chi_q(s) = \langle s|q\rangle$ and $\chi_s(q) = \langle q|s\rangle$. The
first of them can be interpreted as the representation of the eigenvector
$|q\rangle$ in the basis of vectors $|s\rangle$, and the second one is clearly
a representation of $|s\rangle$ in the basis of $|q\rangle$. These functions
are related to each other by the property of the inner product described by
Eq. 2.19: $\chi_q(s) = \chi_s^*(q)$. So, the whole business of finding the
kernel $Q(s,s')$ is now reduced to finding the representation of eigenvectors
of $\hat{Q}$ in terms of those of $\hat{S}$ (or vice versa).

Finding the function $\chi_s(q)$ is impossible without bringing in some
additional information. It can be, for instance, a commutator between the
operators $\hat{Q}$ and $\hat{S}$, or just an outright expression for
$\chi_s(q)$ obtained empirically or heuristically, on the grounds of some
physical arguments, or just by divine insight. Whatever method you choose, you
need to specify now which operators we are dealing with. Of biggest interest
are, of course, the operators of position and momentum, so let's agree to
identify the operator $\hat{Q}$ with the $x$-component $\hat{P}_x$ of the
momentum operator and $\hat{S}$ with the operator $\hat{X}$, the $x$-component
of the position operator. In this case the function $\chi_q(s)$ becomes the
coordinate representation $\chi_{p_x}(x)$ of the eigenvectors of the momentum
operator (its $x$-component, of course).

This function represents a state of the particle with definite momentum $p_x$,
which, according to the de Broglie hypothesis, corresponds to the motion of a
free particle and is described by a harmonic wave whose wave vector has the
$x$-component $k_x = p_x/\hbar$. Disregarding the time-dependent portion of
such a wave, I can write its coordinate-dependent part as
$\chi_{p_x}(x) = a\exp(ik_xx) = a\exp(ip_xx/\hbar)$. The choice
$a = 1/\sqrt{2\pi\hbar}$ generates, according to Eq. 2.36, a delta-normalized
function:

$$\chi_{p_x}(x) = \frac{1}{\sqrt{2\pi\hbar}}\exp\left(ip_xx/\hbar\right). \tag{5.10}$$

Indeed,

$$\frac{1}{2\pi\hbar}\int_{-\infty}^{\infty}e^{i(p_x - p_x')x/\hbar}\,dx = \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{i(p_x - p_x')\tilde{x}}\,d\tilde{x} = \delta(p_x - p_x'), \tag{5.11}$$

where $\tilde{x} = x/\hbar$. Similarly, you can also find that

$$\frac{1}{2\pi\hbar}\int_{-\infty}^{\infty}e^{i(x - x')p_x/\hbar}\,dp_x = \delta(x - x'). \tag{5.12}$$

Now I can write Eq. 5.9 as

$$P_x(x,x') = \frac{1}{2\pi\hbar}\int dp_x\,p_x\,e^{ip_xx/\hbar}e^{-ip_xx'/\hbar} = \frac{1}{2\pi\hbar}\int dp_x\,p_x\,e^{i(x - x')p_x/\hbar}.$$

This integral might look puzzling, because it is obviously divergent. Applying
a magic trick, however, I can turn it into something that actually makes sense.
The trick is quite popular, so it is useful to have it up your sleeve.
Differentiation of Eq. 5.12 with respect to $x$ produces

$$\frac{d\delta(x - x')}{dx} = \frac{i}{2\pi\hbar^2}\int_{-\infty}^{\infty}dp_x\,p_x\,e^{i(x - x')p_x/\hbar}.$$

I hope that you have recognized the integral of interest on the right-hand side
of this equation, so that you can find for $P_x(x,x')$:

$$P_x(x,x') = \frac{\hbar}{i}\frac{d\delta(x - x')}{dx}. \tag{5.13}$$










Substituting Eq. 5.13 into Eq. 5.7 with the variable $q$ replaced with $s = x$
(in that equation $q$ was just a generic variable not yet identified with
eigenvalues of any particular operator), I derive a relation between the
functions $\varphi_\alpha(x)$ and $\varphi_\beta(x)$ representing the states
$|\alpha\rangle$ and $|\beta\rangle$ in the representation of the eigenvectors
of the position operator (position representation for brevity):

$$\varphi_\beta(x) = \frac{\hbar}{i}\int dx'\,\frac{d\delta(x - x')}{dx}\varphi_\alpha(x') = \frac{\hbar}{i}\frac{d}{dx}\int dx'\,\varphi_\alpha(x')\delta(x - x') = -i\hbar\frac{d\varphi_\alpha(x)}{dx}.$$

Thus, we see that the $\hat{P}_x$ operator in the position (coordinate)
representation is equivalent to a differential operator:

$$\hat{p}_x = -i\hbar\frac{d}{dx}, \tag{5.14}$$

where I used the lowercase letter for a particular representation of the
operator as opposed to the uppercase used for abstract operators. The
coordinate operator in the coordinate representation is obviously just an
operator of multiplication by the coordinate's eigenvalue.
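
As a quick sanity check of Eq. 5.14, one can apply the differential operator to
the plane wave of Eq. 5.10 and confirm that the wave is simply multiplied by
$p_x$. The sketch below assumes units with $\hbar = 1$, replaces the derivative
with a central finite difference, and drops the normalization factor, which
cancels in the ratio; the helper names are mine.

```python
import cmath

HBAR = 1.0  # assumption: units in which hbar = 1

def momentum_op(f, x, h=1e-5):
    """Apply -i*hbar d/dx to f at the point x using a central difference."""
    return -1j * HBAR * (f(x + h) - f(x - h)) / (2 * h)

def plane_wave(p):
    """Coordinate-dependent part exp(i p x / hbar) of a state with momentum p."""
    return lambda x: cmath.exp(1j * p * x / HBAR)

p = 2.5
chi = plane_wave(p)
x = 0.37
ratio = momentum_op(chi, x) / chi(x)  # should reproduce the eigenvalue p
print(ratio)
```

The ratio does not depend on the point $x$ at which it is evaluated, which is
exactly what being an eigenfunction means.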

You can turn this analysis around and identify $\hat{S}$ with a component of
the momentum operator and $\hat{Q}$ with the $x$-coordinate. Then $\chi_q(s)$
becomes the momentum representation of the eigenvector of the coordinate:

$$\chi_x(p_x) = \langle p_x|x\rangle = \left(\langle x|p_x\rangle\right)^* = \frac{1}{\sqrt{2\pi\hbar}}\exp\left(-ip_xx/\hbar\right). \tag{5.15}$$

Repeating all the same manipulations as before, you will end up with the
momentum representation of the coordinate operator in the form of a
differential operator:

$$\hat{x} = i\hbar\frac{d}{dp_x}. \tag{5.16}$$

Equations 5.14 and 5.16 are obtained from each other by interchanging
$x \leftrightarrow p_x$ and complex conjugating the result. The complex
conjugation bit is, of course, in sync with the fact that the coordinate
representation of the momentum's eigenvectors and the momentum representation
of the coordinate's eigenvectors are complex conjugates of each other.
Obviously, all the same arguments can be carried out for any other Cartesian
component of the position and momentum operators, which brings us to the
following conclusion. The position representation of the momentum operator is
given by the coordinate gradient operator $\nabla$ as

$$\hat{\mathbf{p}} = -i\hbar\nabla, \tag{5.17}$$

while the momentum representation of the position operator is

$$\hat{\mathbf{r}} = i\hbar\nabla_p, \tag{5.18}$$

where $\nabla_p$ is defined as

$$\nabla_p = \mathbf{e}_x\frac{\partial}{\partial p_x} + \mathbf{e}_y\frac{\partial}{\partial p_y} + \mathbf{e}_z\frac{\partial}{\partial p_z} \tag{5.19}$$

and $\mathbf{e}_{x,y,z}$ are unit vectors of the Cartesian coordinate system
with axes $X$, $Y$, and $Z$. The functions representing the eigenvectors of the
3-D position operator in the momentum representation and of the 3-D momentum
operator in the position representation are, obviously, obtained by multiplying
their respective one-dimensional counterparts:

$$\chi_{\mathbf{r}}(\mathbf{p}) = \frac{1}{(2\pi\hbar)^{3/2}}e^{-i\mathbf{p}\cdot\mathbf{r}/\hbar}, \tag{5.20}$$

$$\chi_{\mathbf{p}}(\mathbf{r}) = \frac{1}{(2\pi\hbar)^{3/2}}e^{i\mathbf{p}\cdot\mathbf{r}/\hbar}. \tag{5.21}$$

You might be wondering how we ended up with P and R being represented by differential rather than by integral operators, as was my original intention. It happened thanks to the singular nature of the kernels, $\langle \mathbf{r}'|\hat{P}|\mathbf{r}\rangle$ and $\langle \mathbf{p}'|\hat{R}|\mathbf{p}\rangle$, which turned out to be proportional to the derivative of the delta-function. And the delta-function derivatives are quite capable of turning integrals into derivatives, as happened in this particular case.

Having found representations of the coordinate and momentum operators, you can easily compute their commutator. For instance, for the x-components in the coordinate representation, you will easily find

$$[\hat{x}, \hat{p}_x] f(x) = -i\hbar x \frac{d}{dx} f(x) + i\hbar \frac{d}{dx}\left(x f(x)\right) = i\hbar f(x) \;\Rightarrow\; [\hat{X}, \hat{P}_x] = i\hbar. \tag{5.22}$$

Since this commutator is just a number, it must not depend on the particular representation (this is why I returned capital letters for the operators). Indeed, the calculations carried out in the momentum representation yield

$$[\hat{X}, \hat{P}_x] f(p) = i\hbar \frac{d}{dp}\left(p f(p)\right) - i\hbar p \frac{d}{dp} f(p) = i\hbar f(p) \;\Rightarrow\; [\hat{X}, \hat{P}_x] = i\hbar,$$

as expected.
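
The commutator in Eq. 5.22 is easy to test numerically. The following is only a minimal sketch under stated assumptions: natural units with ħ = 1, a central finite-difference derivative in place of d/dx, and an arbitrarily chosen Gaussian test function. The helper names (`derivative`, `momentum`, `position`, `commutator_xp`) are made up for this illustration.

```python
import math

HBAR = 1.0  # assumption: natural units


def derivative(g, x, h=1e-4):
    """Central finite-difference approximation of dg/dx at point x."""
    return (g(x + h) - g(x - h)) / (2.0 * h)


def momentum(g):
    """Coordinate representation of P_x: returns the function -i*hbar*dg/dx."""
    return lambda x: -1j * HBAR * derivative(g, x)


def position(g):
    """Coordinate representation of X: returns the function x*g(x)."""
    return lambda x: x * g(x)


def commutator_xp(g):
    """Returns the function ([x, p_x] g)(x) = x (p g)(x) - p (x g)(x)."""
    xp = position(momentum(g))
    px = momentum(position(g))
    return lambda x: xp(x) - px(x)


def f(x):
    """Arbitrary smooth test function (an assumption of this example)."""
    return math.exp(-x * x)


x0 = 0.3
lhs = commutator_xp(f)(x0)   # numerical [x, p_x] f
rhs = 1j * HBAR * f(x0)      # i*hbar*f, as Eq. 5.22 predicts
print(abs(lhs - rhs))        # small discretization error
```

The difference shrinks as the step h decreases, which is what you would expect if the commutator is indeed multiplication by iħ.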

I will finish this section by deriving a relation between functions $\varphi_\alpha(\mathbf{r})$ and $\varphi_\alpha(\mathbf{p})$ representing the same state $|\alpha\rangle$ in the coordinate and momentum representations, respectively. To achieve this, I am again resorting to the magic of the completeness relation based upon the eigenvectors of momentum. Substitution of this relation into Eq. 5.5 adapted for the eigenvectors of the position operator gives

$$\varphi_\alpha(\mathbf{r}) = \langle \mathbf{r}|\alpha\rangle = \int d\mathbf{p}\, \langle \mathbf{r}|\mathbf{p}\rangle \langle \mathbf{p}|\alpha\rangle = \int d\mathbf{p}\, \chi^*_{\mathbf{r}}(\mathbf{p})\, \varphi_\alpha(\mathbf{p}), \tag{5.23}$$

where I used Eq. 5.20 for the momentum representation of the position operator’s eigenvector. One can easily invert Eq. 5.23 using the Fourier representation of the delta-function, Eq. 2.36, to obtain

$$\varphi_\alpha(\mathbf{p}) = \frac{1}{(2\pi\hbar)^{3/2}} \int d^3r\, e^{-i\mathbf{p}\cdot\mathbf{r}/\hbar}\, \varphi_\alpha(\mathbf{r}). \tag{5.24}$$
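
To see Eq. 5.24 at work, here is a hedged one-dimensional sketch. It assumes ħ = 1 and a normalized Gaussian wave function (my choice, not taken from the text), for which the momentum-space function is known to be a Gaussian of the same form. The integral is replaced by a trapezoidal sum over a finite interval, and the function names are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumption: natural units


def phi_x(x):
    """Normalized Gaussian in the coordinate representation (example choice)."""
    return math.pi ** -0.25 * math.exp(-0.5 * x * x)


def phi_p(p, x_max=10.0, n=2000):
    """1-D analog of Eq. 5.24 evaluated with the trapezoidal rule on [-x_max, x_max]."""
    dx = 2.0 * x_max / n
    total = 0.0 + 0.0j
    for k in range(n + 1):
        x = -x_max + k * dx
        weight = 0.5 if k in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * cmath.exp(-1j * p * x / HBAR) * phi_x(x)
    return total * dx / math.sqrt(2.0 * math.pi * HBAR)


def phi_p_exact(p):
    """Analytic momentum-space partner of phi_x for hbar = 1."""
    return math.pi ** -0.25 * math.exp(-0.5 * p * p)


p0 = 0.7
print(abs(phi_p(p0) - phi_p_exact(p0)))  # numerical and analytic results agree closely
```

The cutoff `x_max` and the number of steps `n` are illustrative; they only need to be large enough for the integrand to have died out at the ends of the interval.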


5.1.2 Parity Operator 

In this section I want to make a slight detour and define an important operator closely 
related to the eigenvectors of the position operator used to introduce the position 
representation of the state vectors. This operator, called parity operator, is often used 
to classify wave functions arising in this representation, so it seems quite appropriate 
to talk about it here. 

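The parity operator $\hat{\Pi}$ is defined by its action on the eigenvectors of the position operator $|\mathbf{r}\rangle$ as

$$\hat{\Pi}\,|\mathbf{r}\rangle = |-\mathbf{r}\rangle, \tag{5.25}$$

an operation often called inversion. It is easy to see that this operator is Hermitian:

$$\langle \mathbf{r}'|\hat{\Pi}|\mathbf{r}\rangle = \langle \mathbf{r}'|-\mathbf{r}\rangle = \delta(\mathbf{r}' + \mathbf{r}),$$

$$\left(\langle \mathbf{r}|\hat{\Pi}|\mathbf{r}'\rangle\right)^* = \left(\langle \mathbf{r}|-\mathbf{r}'\rangle\right)^* = \delta(\mathbf{r}' + \mathbf{r}),$$

and that it is equal to its inverse: $\hat{\Pi}^2|\mathbf{r}\rangle = \hat{\Pi}|-\mathbf{r}\rangle = |\mathbf{r}\rangle \Rightarrow \hat{\Pi}^2 = \hat{I}$, where $\hat{I}$ is the identity operator. It follows immediately from the last expression that $\hat{\Pi} = \hat{\Pi}^{-1}$. Being Hermitian and equal to its own inverse, this operator is also unitary. The action of this operator on an arbitrary state can be defined using its position representation:

$$\langle \mathbf{r}|\hat{\Pi}|\psi\rangle = \langle -\mathbf{r}|\psi\rangle = \psi(-\mathbf{r}),$$

where $\psi(\mathbf{r}) = \langle \mathbf{r}|\psi\rangle$. It is also important to know how this operator acts on the eigenvectors of the momentum operator, which can be found out using again the coordinate representation:

$$\hat{\Pi}\,|\mathbf{p}\rangle = \int d\mathbf{r}\, \langle \mathbf{r}|\mathbf{p}\rangle\, \hat{\Pi}|\mathbf{r}\rangle = \int d\mathbf{r}\, \langle \mathbf{r}|\mathbf{p}\rangle\, |-\mathbf{r}\rangle = \int d\mathbf{r}\, \langle -\mathbf{r}|\mathbf{p}\rangle\, |\mathbf{r}\rangle.$$

To derive this result I used the coordinate representation of $|\mathbf{p}\rangle$ and changed the integration variable $\mathbf{r}$ to $-\mathbf{r}$ in the last integral. Finally, using Eq. 5.21 I can write $\langle -\mathbf{r}|\mathbf{p}\rangle = \langle \mathbf{r}|-\mathbf{p}\rangle$, which results in the following transformation rule for $|\mathbf{p}\rangle$:

$$\hat{\Pi}\,|\mathbf{p}\rangle = |-\mathbf{p}\rangle. \tag{5.26}$$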

The parity operator has only two eigenvalues: 1 or −1. Indeed, assume that $|\pi\rangle$ is an eigenvector with eigenvalue $\sigma$: $\hat{\Pi}|\pi\rangle = \sigma|\pi\rangle$. Apply the parity operator to this relation again: $\hat{\Pi}^2|\pi\rangle = \sigma\hat{\Pi}|\pi\rangle \Leftrightarrow |\pi\rangle = \sigma^2|\pi\rangle \Rightarrow \sigma = \pm 1$. Accordingly, eigenvectors of $\hat{\Pi}$ represent those states that either do not change upon inversion (we can call them even states) or those that change their sign (odd states). Obviously all even and all odd functions are coordinate representations of the eigenvectors of the parity operator.
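
A small sketch may help to make the even/odd classification concrete. It works in one dimension, where the position representation of the parity operator simply maps ψ(x) to ψ(−x), and splits an arbitrary function into its even and odd parts. The function names and the test function are assumptions made for this illustration only.

```python
import math


def parity(psi):
    """Position representation of the parity operator: (Pi psi)(x) = psi(-x)."""
    return lambda x: psi(-x)


def even_part(psi):
    """Projection on the eigenvalue +1 subspace: (psi + Pi psi)/2."""
    flipped = parity(psi)
    return lambda x: 0.5 * (psi(x) + flipped(x))


def odd_part(psi):
    """Projection on the eigenvalue -1 subspace: (psi - Pi psi)/2."""
    flipped = parity(psi)
    return lambda x: 0.5 * (psi(x) - flipped(x))


def psi(x):
    """Example function with no definite parity (an arbitrary choice)."""
    return math.exp(x)


x0 = 0.8
psi_even, psi_odd = even_part(psi), odd_part(psi)
print(parity(psi_even)(x0), psi_even(x0))   # equal: eigenvalue +1
print(parity(psi_odd)(x0), -psi_odd(x0))    # equal: eigenvalue -1
print(parity(parity(psi))(x0), psi(x0))     # equal: Pi squared is the identity
```

Any function is the sum of its even and odd parts, which is just another way of saying that the eigenvectors of the parity operator form a complete set.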

Parity operator is one of the simplest symmetry operators, which means that it 
can be used to determine that a Hamiltonian or another operator corresponding to 
a quantum observable does not change, when a system is transformed in a certain 
way. In fancy language, this property is called invariance with respect to a certain 
transformation. To see why such invariance can be important, consider the time-independent Schrödinger equation:

$$\hat{H}\,|\alpha\rangle = E\,|\alpha\rangle$$

and assume that there is an operator (usually a unitary one), which can be used 
to describe a transformation of the system. Parity operator is one such example: it 
generates spatial inversion of the system with respect to an origin of the coordinate 
system. Rotations with respect to an axis or a point provide other examples of 
transformations described by unitary operators. In what follows I will use notation 
for the parity operator for the sake of concreteness, but most conclusions in the next 
paragraph will be applicable to any symmetry operator. 

So, let me apply operator $\hat{\Pi}$ to the time-independent Schrödinger equation. In addition, I will also insert the expression $\hat{\Pi}^{-1}\hat{\Pi}$, which is obviously equal to the identity operator, between $\hat{H}$ and $|\alpha\rangle$:

$$\hat{\Pi}\hat{H}\hat{\Pi}^{-1}\hat{\Pi}\,|\alpha\rangle = E\,\hat{\Pi}\,|\alpha\rangle.$$

The Schrödinger equation preserves its form when rewritten in terms of the new vector $|\alpha'\rangle = \hat{\Pi}|\alpha\rangle$ and the new Hamiltonian $\hat{H}' = \hat{\Pi}\hat{H}\hat{\Pi}^{-1}$. This exercise demonstrates that a relation between vectors and operators is preserved if a transformation of a vector is accompanied by the corresponding transformation of the operator. I can now give a formal definition of the invariance of a system, which I earlier loosely described by saying that “the system does not change” upon a certain operation: the system is invariant under a transformation if its Hamiltonian obeys the following condition: $\hat{H}' = \hat{\Pi}\hat{H}\hat{\Pi}^{-1} = \hat{H}$. One of the immediate consequences of this condition is that if $|\alpha\rangle$ is an eigenvector of the Hamiltonian, then $|\alpha'\rangle = \hat{\Pi}|\alpha\rangle$ is also an eigenvector with the same eigenvalue. This is an important conclusion, but I cannot dwell on it for too long as it will bring us way outside of our comfort zone. What is more important for us is that the condition $\hat{\Pi}\hat{H}\hat{\Pi}^{-1} = \hat{H}$ implies that the Hamiltonian and the transformation operator commute: $\hat{\Pi}\hat{H} = \hat{H}\hat{\Pi}$. This information can be immediately put to use because we already know what this means: the transformation operator and the Hamiltonian have a common set of eigenvectors. Usually eigenvectors of the former are known, and this knowledge makes finding the eigenvectors of the latter easier. For instance, if I prove that my Hamiltonian is invariant with respect to the parity transformation, I can immediately conclude that its eigenvectors can be chosen as either even or odd functions, which, as you will see in Sect. 6.2, significantly simplifies their computation.

The Hamiltonian is not the only operator whose behavior under parity transformation is of interest. Other operators worthy of our consideration are the position and momentum operators. Let me begin with the position operator defined, as you well know, by $\hat{\mathbf{r}}|\mathbf{r}\rangle = \mathbf{r}|\mathbf{r}\rangle$. Performing the same manipulation with this expression as the one to which I just subjected the Hamiltonian, I will have

$$\hat{\Pi}\hat{\mathbf{r}}\hat{\Pi}^{-1}\hat{\Pi}\,|\mathbf{r}\rangle = \mathbf{r}\,\hat{\Pi}\,|\mathbf{r}\rangle.$$

Using Eq. 5.25 I transform this into

$$\hat{\Pi}\hat{\mathbf{r}}\hat{\Pi}^{-1}\,|-\mathbf{r}\rangle = \mathbf{r}\,|-\mathbf{r}\rangle,$$

which only makes sense if

$$\hat{\Pi}\hat{\mathbf{r}}\hat{\Pi}^{-1} = -\hat{\mathbf{r}}.$$

This result demonstrates that the position operator changes its sign upon inversion, which, after some reflection, appears almost obvious. Operators which have this property are called “odd” as opposed to “even” operators, which do not change upon parity transformation. Obviously, the inversion-invariant operators are by definition “even.” I will leave it for you as an exercise to prove that the momentum operator is also “odd.”


5.1.3 Schrödinger Equation in the Position Representation

The position representation is the most popular in practical applications of quantum theory. This is the representation in which the original de Broglie matter waves were described and in which Schrödinger wrote his equation. Much of classical physics deals with processes occurring in space and time, so it is not surprising that the wave functions written in the position representation hold a special place in our hearts.¹ It is also important, of course, that the potential energy operator, which might have quite elaborate position dependence, looks the simplest in the position representation. The momentum operator, on the other hand, does not come in such a variety of forms: it appears mostly in the kinetic energy as the $p^2$ term, whose coordinate representation looks quite tolerable.

To derive the coordinate representation of the Hamiltonian, I need first to resolve a few technical questions. In particular, I need to know how to generate a representation of the product of two operators from representations of individual factors. Consider, for instance, the operator expression $\hat{Q}\hat{S}$, whose integral kernel in some basis $|\lambda\rangle$ is $\langle\lambda'|\hat{Q}\hat{S}|\lambda\rangle$. Inserting a completeness relation (again!) between the operators, I obtain

$$\langle\lambda'|\hat{Q}\hat{S}|\lambda\rangle = \int d\lambda''\, \langle\lambda'|\hat{Q}|\lambda''\rangle \langle\lambda''|\hat{S}|\lambda\rangle = \int d\lambda''\, Q(\lambda', \lambda'')\, S(\lambda'', \lambda). \tag{5.27}$$

An important example is operator $\hat{P}_x^2$, whose position representation would be useful to know. The integral kernel for $\hat{P}_x$ was found in the previous section as $Q(x', x'') = -i\hbar\,\delta'(x' - x'')$, where ′ on the delta-function signifies differentiation with respect to the first argument. Substitution of these expressions into Eq. 5.27 yields


$$\langle x'|\hat{P}_x^2|x\rangle = -\hbar^2 \int dx''\, \frac{\partial \delta(x' - x'')}{\partial x'}\, \frac{\partial \delta(x'' - x)}{\partial x''} = \hbar^2 \left.\frac{\partial^2 \delta(x' - x'')}{\partial x'\,\partial x''}\right|_{x'' = x} = -\hbar^2\, \frac{\partial^2 \delta(x' - x)}{\partial x'^2} = -\hbar^2\, \frac{\partial^2 \delta(x - x')}{\partial x^2},$$

where in the last step I used evenness of the delta-function to switch from $\delta(x' - x)$ to $\delta(x - x')$ and the chain differentiation rule to change the differentiation variable. If you plug this result into the expression

$$\varphi_\beta(x') = \int \langle x'|\hat{P}_x^2|x\rangle\, \varphi_\alpha(x)\, dx,$$

you will get

$$\varphi_\beta(x') = -\hbar^2 \int \frac{\partial^2 \delta(x - x')}{\partial x^2}\, \varphi_\alpha(x)\, dx = -\hbar^2\, \frac{d^2 \varphi_\alpha(x')}{dx'^2}, \tag{5.28}$$

which means that the coordinate representation of the $\hat{P}_x^2$ operator is just the square of the $-i\hbar\, d/dx$ operator (could have guessed this, of course, but this derivation was a nice exercise, wasn’t it?). Obviously the same result can be obtained for the two other components of the momentum, which means that operator $\hat{P}^2$ in the position representation is given by

$$\hat{p}^2 = -\hbar^2 \nabla^2, \tag{5.29}$$

where $\nabla^2 = \nabla \cdot \nabla$ is the Laplacian operator. Using Eq. 5.19, one can easily derive

$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}. \tag{5.30}$$

¹ You might remember that the lack of a spatial-temporal picture was the main complaint Schrödinger leveled against Heisenberg’s “transcendental” algebraic approach.


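Since the action of the position operator in the position representation amounts to simple multiplication by the position vector $\mathbf{r}$, the position representation of the potential energy operator $V(\hat{\mathbf{r}})$ amounts to multiplication by $V(\mathbf{r})$. Thus the action of the entire Hamiltonian in the position representation can now be described as

$$\begin{aligned}
\hat{H}_r \Psi_\alpha(\mathbf{r}) &\equiv \int d\mathbf{r}'\, \langle\mathbf{r}|\hat{H}|\mathbf{r}'\rangle\, \Psi_\alpha(\mathbf{r}') \\
&= \frac{1}{2m_e} \int d\mathbf{r}'\, \langle\mathbf{r}|\hat{P}^2|\mathbf{r}'\rangle\, \Psi_\alpha(\mathbf{r}') + \int d\mathbf{r}'\, \langle\mathbf{r}|\hat{V}|\mathbf{r}'\rangle\, \Psi_\alpha(\mathbf{r}') \\
&= -\frac{\hbar^2}{2m_e} \nabla^2 \Psi_\alpha(\mathbf{r}) + \int d\mathbf{r}'\, V(\mathbf{r}')\, \langle\mathbf{r}|\mathbf{r}'\rangle\, \Psi_\alpha(\mathbf{r}') \\
&= -\frac{\hbar^2}{2m_e} \nabla^2 \Psi_\alpha(\mathbf{r}) + \int d\mathbf{r}'\, V(\mathbf{r}')\, \delta(\mathbf{r} - \mathbf{r}')\, \Psi_\alpha(\mathbf{r}') \\
&= -\frac{\hbar^2}{2m_e} \nabla^2 \Psi_\alpha(\mathbf{r}) + V(\mathbf{r})\, \Psi_\alpha(\mathbf{r}),
\end{aligned} \tag{5.31}$$

where $\Psi_\alpha(\mathbf{r})$ stands for $\langle\mathbf{r}|\alpha\rangle$. Correspondingly, I can write down the Hamiltonian in the position representation simply as

$$\hat{H}_r = -\frac{\hbar^2}{2m_e} \nabla^2 + V(\mathbf{r}, t), \tag{5.32}$$

which acts on functions $\Psi(\mathbf{r}, t)$ realizing the position representation of the corresponding quantum states.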

The time-dependent Schrödinger equation in the coordinate representation is obtained from Eq. 4.9 by premultiplying it with the basis bra vector $\langle\mathbf{r}|$ and using the completeness relation:

$$i\hbar\, \frac{\partial \langle\mathbf{r}|\alpha\rangle}{\partial t} = \int d\mathbf{r}'\, \langle\mathbf{r}|\hat{H}|\mathbf{r}'\rangle \langle\mathbf{r}'|\alpha\rangle.$$

The left-hand side of this equation is simply $i\hbar\, \partial\Psi(\mathbf{r}, t)/\partial t$ (I will drop the subindex $\alpha$ from now on), while the right-hand side was evaluated just a few lines above in Eq. 5.31. Thus, I can write the position representation of Eq. 4.9 as

$$i\hbar\, \frac{\partial \Psi(\mathbf{r}, t)}{\partial t} = \left[-\frac{\hbar^2}{2m_e} \nabla^2 + V(\mathbf{r}, t)\right] \Psi(\mathbf{r}, t). \tag{5.33}$$


This is what most quantum mechanics textbooks call the celebrated time-dependent Schrödinger equation governing the quantum dynamics of a single-particle quantum state represented by the wave function $\Psi(\mathbf{r}, t)$. If the potential function in Eq. 5.33 does not depend on time, one can separate the time and coordinate dependence of the wave function as

$$\Psi(\mathbf{r}, t) = \exp\left(-\frac{iEt}{\hbar}\right) \psi(\mathbf{r}), \tag{5.34}$$

where $\psi(\mathbf{r})$ obeys the equation

$$\left[-\frac{\hbar^2}{2m_e} \nabla^2 + V(\mathbf{r})\right] \psi(\mathbf{r}) = E\, \psi(\mathbf{r}), \tag{5.35}$$

often called the time-independent Schrödinger equation. Rewritten in the form $\hat{H}_r \psi(\mathbf{r}) = E\psi(\mathbf{r})$, where the subindex r points to the position representation, it becomes reminiscent of Eq. 4.11 defining eigenvalues and eigenvectors of the Hamiltonian. Obviously, Eq. 5.35 produces eigenvectors of the Hamiltonian in the position representation.

This equation, which is a linear differential equation of the second order, has to be complemented by boundary conditions specifying the behavior of the wave functions at infinity. They depend on the type of spectrum (discrete or continuous) the respective wave functions belong to. If the eigenvalue E belongs to a discrete spectrum, we know from the discussion in Sect. 2.2 that the corresponding states are square-integrable, which means that the integral $\int |\psi(\mathbf{r})|^2 d\mathbf{r}$ taken over the entire volume (it defines the norm of the state vector in the coordinate representation; see Eq. 2.43 or 5.4) is finite. Only functions which tend to zero fast enough when $|\mathbf{r}| \to \infty$ will satisfy this requirement. Thus, the boundary condition for the wave functions of the discrete spectrum can be formulated as

$$\lim_{|\mathbf{r}| \to \infty} |\psi(\mathbf{r})| = 0. \tag{5.36}$$

The existence of a discrete spectrum depends on the behavior of the potential function $V(\mathbf{r})$ and is closely related to the type of classical motion at a given energy. Imagine, for instance, that there exists a closed surface in space separating regions where $E > V(\mathbf{r})$ from the regions where $E < V(\mathbf{r})$. A classical particle can only exist in the former region, because the latter would correspond to negative values of the kinetic energy. Regions where the classical kinetic energy would be positive are called classically allowed, while regions where the kinetic energy turns negative are called classically forbidden. The boundary between these two regions, where $E = V(\mathbf{r})$, forms a surface, which a classical particle cannot cross. Such motion of a classical particle is called bound motion. In the quantum mechanical case, the Schrödinger Eq. 5.35 has solutions in both regions, which, however, have completely different behavior. An analysis in the most generic three-dimensional case is mathematically too involved to attempt here, so I shall illustrate this difference by considering a one-dimensional model, with the wave function and the potential depending on a single coordinate, e.g., x. For a classically bound motion to take place in this case, there must exist an interval of coordinates $x_1 < x < x_2$, where $E > V(x)$, while everywhere else $E < V(x)$. The terminal points of this interval are the so-called turning points, where a classical particle would momentarily stop before reversing its velocity.

It is convenient to analyze this situation quantum mechanically by rewriting the Schrödinger equation as

$$\frac{d^2\psi(x)}{dx^2} = \frac{2m_e}{\hbar^2} \left[V(x) - E\right] \psi(x). \tag{5.37}$$

In the classically forbidden regions, which extend to infinity in both positive and negative directions of the coordinate axis, the second derivative of the wave function always has the same sign as the wave function itself. It is easier to discuss the meaning of this result assuming that the wave function and, respectively, its second derivative, in the classically forbidden region, are positive. In this case, if the first derivative is positive (wave function grows), it becomes even more positive so that the wave function bends upward, growing even faster with increasing x. If, however, the first derivative is negative (wave function decreases), it is becoming less and less negative, approaching zero. The wave function in this case must also asymptotically approach zero without ever changing its sign. This wave function would obviously satisfy the boundary condition given in Eq. 5.36 and, therefore, correspond to an eigenvalue from the discrete spectrum. If the wave function is negative, all the same arguments work, and the wave function is either monotonically decreasing, becoming even more negative, or increasing, approaching zero from the negative side. In the classically allowed region, the second derivative has the sign opposite to that of the wave function, and the solution to the equation does not have to be monotonic. A solution that decays in both forbidden regions at once exists only for selected values of E, because a solution chosen to decay on one side generally picks up a growing component on the other. The main conclusion following from these arguments is that the energy eigenvalues belonging to an interval corresponding to a classically bound motion form, in quantum description, a discrete spectrum.
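
The qualitative argument above can be turned into a small numerical experiment, usually called the shooting method. The sketch below is only an illustration under assumptions that are not part of the text: units with ħ = m = 1, a harmonic potential V(x) = x²/2 chosen as the example of a classically bound motion (its lowest energy is known to be 1/2 in these units), and a simple three-point finite-difference recurrence for Eq. 5.37. All function names are hypothetical.

```python
def potential(x):
    """Example binding potential (assumption): harmonic well, hbar = m = 1."""
    return 0.5 * x * x


def psi_at_right_edge(energy, x_max=5.0, n_steps=4000):
    """Integrate Eq. 5.37 from -x_max to +x_max, starting from an almost zero
    wave function deep inside the left forbidden region, and return psi(+x_max)."""
    h = 2.0 * x_max / n_steps
    psi_prev, psi = 0.0, 1e-6          # psi(-x_max) and psi(-x_max + h)
    for i in range(1, n_steps):
        x = -x_max + i * h
        second_derivative = 2.0 * (potential(x) - energy) * psi
        # three-point recurrence: psi(x+h) = 2 psi(x) - psi(x-h) + h^2 psi''(x)
        psi_prev, psi = psi, 2.0 * psi - psi_prev + h * h * second_derivative
    return psi


def ground_state_energy(e_low=0.1, e_high=1.0, iterations=40):
    """Bisection on the sign of psi(+x_max): the sign flips when the energy
    passes through a value for which the solution decays on both sides."""
    f_low = psi_at_right_edge(e_low)
    for _ in range(iterations):
        e_mid = 0.5 * (e_low + e_high)
        f_mid = psi_at_right_edge(e_mid)
        if f_low * f_mid > 0.0:
            e_low, f_low = e_mid, f_mid
        else:
            e_high = e_mid
    return 0.5 * (e_low + e_high)


print(psi_at_right_edge(0.4), psi_at_right_edge(0.6))  # opposite signs, both large
print(ground_state_energy())                           # close to 0.5
```

For a generic energy the solution blows up at the right edge, either upward or downward; only at isolated energies does it stay small there, which is the discrete spectrum in action.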

Wave functions corresponding to the continuous spectrum of energy usually appear in situations when the potential approaches a constant finite value at infinity. If the energy E exceeds this limiting value of the potential, then asymptotically for large values of x, Eq. 5.37 takes the following form:

$$\frac{d^2\psi(x)}{dx^2} = -\frac{2m_e}{\hbar^2} \left[E - V(\infty)\right] \psi(x),$$

which has two possible solutions $\psi(x) \propto \exp(ikx)$ or $\psi(x) \propto \exp(-ikx)$, where $k = \sqrt{2m_e\left[E - V(\infty)\right]}/\hbar$. Any one of these asymptotic forms can be chosen as a boundary condition at infinity: the actual choice is determined by the physical problem at hand. This situation often appears in so-called scattering problems, when one is interested in the behavior of a stream of particles incident on the potential from infinity and being registered by a detector on the opposite side of the potential. For this reason, wave functions with asymptotic behavior of this kind are called scattering wave functions. I will talk much more about this situation in subsequent chapters of the book.

Finally, you need to learn about the continuity properties of the wave functions. 
This issue arises only if the potential V(r) is not everywhere continuous (if the 
potential is continuous, the wave functions are automatically continuous). We 
require that the wave function remains continuous regardless of the discontinuity 
of the potential. The physical foundation for such a requirement can be given as 
follows. A discontinuity of wave function means that its first derivative becomes 
infinite at the point of discontinuity, which creates a whole bunch of problems, e.g., 
the expectation value of the momentum of the particle at this point becomes infinite. 

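However, the continuity of the first derivative of the wave function is not necessarily guaranteed. In the one-dimensional case, one can show that if the discontinuities of the potential only occur in the form of finite “jumps,” the first derivative of the wave function remains continuous (provided that the mass of the particle remains the same on both sides of the “step in the potential”). To see this one simply needs to integrate Eq. 5.37 over an infinitesimal interval surrounding the point of discontinuity of the potential, $x_d$:

$$\lim_{\varepsilon \to 0} \int_{x_d - \varepsilon}^{x_d + \varepsilon} \frac{d^2\psi(x)}{dx^2}\, dx = \lim_{\varepsilon \to 0} \left( \left.\frac{d\psi(x)}{dx}\right|_{x_d + \varepsilon} - \left.\frac{d\psi(x)}{dx}\right|_{x_d - \varepsilon} \right) = \lim_{\varepsilon \to 0} \frac{2m_e}{\hbar^2} \int_{x_d - \varepsilon}^{x_d + \varepsilon} \left[V(x) - E\right] \psi(x)\, dx \;\Rightarrow$$

$$\lim_{\varepsilon \to 0} \left( \left.\frac{d\psi(x)}{dx}\right|_{x_d + \varepsilon} - \left.\frac{d\psi(x)}{dx}\right|_{x_d - \varepsilon} \right) = \lim_{\varepsilon \to 0} \frac{2m_e}{\hbar^2} \left[ (V_1 + V_2)\, \psi(x_d)\, \varepsilon - 2E\, \psi(x_d)\, \varepsilon \right] = 0 \;\Rightarrow\; \left.\frac{d\psi(x)}{dx}\right|_{x_d + 0} = \left.\frac{d\psi(x)}{dx}\right|_{x_d - 0}, \tag{5.38}$$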


where $V_1 = V(x_d - 0)$ and $V_2 = V(x_d + 0)$. In some semiconductor heterostructures (alternating planar layers of different semiconductors), Eq. 5.37 is sometimes used to describe the behavior of charged particles in the so-called effective mass approximation. In this approximation the periodic potential of ions felt by electrons is approximately taken into account by modifying the mass of the electrons from their normal “free electron” value. The new “effective” masses are usually different in different materials, and if the discontinuity of the potential occurs due to an electron passing from one semiconductor to another, its effective mass also changes. Repeating the previous derivation while taking into account the possibility of a discontinuity of the mass, you can derive a generalized derivative continuity condition:

$$\frac{1}{m_1} \left.\frac{d\psi(x)}{dx}\right|_{x_d - 0} = \frac{1}{m_2} \left.\frac{d\psi(x)}{dx}\right|_{x_d + 0}, \tag{5.39}$$

where $m_{1,2}$ are values of the effective mass on the two sides of the potential step.

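The position representation allows for a useful and conceptually important generalization of the idea of probability conservation expressed by the normalization condition 2.43. Consider the following quantity:

$$P(t) = \int_v |\Psi(\mathbf{r}, t)|^2\, d^3r,$$

which yields the probability that a measurement of the particle’s position will find it within the integration volume v. Computing the time derivative of this quantity and utilizing Schrödinger’s equation, Eq. 5.33, you get

$$\begin{aligned}
\frac{dP}{dt} &= \int_v \left[ \Psi(\mathbf{r}, t)\, \frac{\partial \Psi^*(\mathbf{r}, t)}{\partial t} + \Psi^*(\mathbf{r}, t)\, \frac{\partial \Psi(\mathbf{r}, t)}{\partial t} \right] d^3r \\
&= \frac{1}{i\hbar} \int_v \left\{ \Psi^*(\mathbf{r}, t) \left[-\frac{\hbar^2}{2m_e} \nabla^2 + V(\mathbf{r}, t)\right] \Psi(\mathbf{r}, t) - \Psi(\mathbf{r}, t) \left[-\frac{\hbar^2}{2m_e} \nabla^2 + V(\mathbf{r}, t)\right] \Psi^*(\mathbf{r}, t) \right\} d^3r \\
&= \frac{i\hbar}{2m_e} \int_v \left\{ \Psi^*(\mathbf{r}, t)\, \nabla^2 \Psi(\mathbf{r}, t) - \Psi(\mathbf{r}, t)\, \nabla^2 \Psi^*(\mathbf{r}, t) \right\} d^3r.
\end{aligned} \tag{5.40}$$

The terms containing the potential cancel each other because $V(\mathbf{r}, t)$ is real.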


To proceed you will need the following vector identity:

$$\Psi^*(\mathbf{r}, t)\, \nabla^2 \Psi(\mathbf{r}, t) - \Psi(\mathbf{r}, t)\, \nabla^2 \Psi^*(\mathbf{r}, t) = \nabla \cdot \left[ \Psi^*(\mathbf{r}, t)\, \nabla \Psi(\mathbf{r}, t) - \Psi(\mathbf{r}, t)\, \nabla \Psi^*(\mathbf{r}, t) \right],$$

which is easily proved by working it out from the right to the left. What is important is that the expression on the right has the form of a divergence of a vector so that Eq. 5.40 can be written as

$$\frac{d}{dt} \int_v |\Psi(\mathbf{r}, t)|^2\, d^3r + \int_v \nabla \cdot \mathbf{j}\, d^3r = 0, \tag{5.41}$$

where I introduced a vector called the probability current density

$$\mathbf{j} = \frac{i\hbar}{2m_e} \left[ \Psi(\mathbf{r}, t)\, \nabla \Psi^*(\mathbf{r}, t) - \Psi^*(\mathbf{r}, t)\, \nabla \Psi(\mathbf{r}, t) \right]. \tag{5.42}$$

One important property of this quantity is that it vanishes for the wave functions representing a stationary state if its time-independent part is real. Indeed, substituting

$$\Psi(\mathbf{r}, t) = \exp\left(-iEt/\hbar\right) \psi(\mathbf{r}),$$

you can see that the product of time-dependent factors yields unity, and, if $\psi(\mathbf{r})$ is real, the remaining two terms simply cancel each other. Equation 5.42 can be rewritten in a more illuminating form: introducing a velocity operator

$$\hat{\mathbf{v}} = \frac{\hat{\mathbf{p}}}{m_e} = -\frac{i\hbar}{m_e} \nabla,$$

it can be presented as

$$\mathbf{j} = \frac{1}{2} \left[ \Psi^* \hat{\mathbf{v}} \Psi + \Psi \left(\hat{\mathbf{v}} \Psi\right)^* \right]. \tag{5.43}$$

If you do not see an immediate usefulness of bringing out the velocity operator in the definition of $\mathbf{j}$ (besides the purely aesthetic fact that Eq. 5.43 is more pleasant to the eye), let me point out that it highlights the connection between quantum and classical concepts of the current density. As you may remember from an introductory physics course, the current density for any flowing quantity in classical physics can be written down as $\rho \mathbf{v}$, where $\rho$ is the density of whatever does the flowing (charge, mass, etc.) and $\mathbf{v}$ is the velocity of the flow. This connection becomes even more direct for a freely propagating particle with the wave function

$$\Psi(\mathbf{r}, t) = A \exp\left(-iEt/\hbar + i\mathbf{p}\cdot\mathbf{r}/\hbar\right).$$

Substituting this wave function into Eq. 5.42 or 5.43, you will find for the quantum $\mathbf{j}$:

$$\mathbf{j} = |A|^2\, \mathbf{p}/m_e,$$

which is an exact reproduction of the classical expression if you identify $|A|^2$ with $\rho$.

Using Gauss’ theorem (google it, if you do not remember!), I can rewrite Eq. 5.41 as

$$\frac{d}{dt} \int_v |\Psi(\mathbf{r}, t)|^2\, d^3r = -\oint_\Sigma \mathbf{j} \cdot \mathbf{n}\, dS, \tag{5.44}$$

where $\mathbf{n}$ is a unit vector normal to the surface $\Sigma$ enclosing volume v (directed outward). The right-hand side of Eq. 5.44 has the meaning of a flux (just like electric field flux in electromagnetism) characterizing the “flow” of probability across a boundary encompassing the volume. This equation simply states that the probability “to locate” a particle within a given volume decreases if the probability “flows” outside of the volume and increases if the flow of probability is reversed. In this sense, this equation is the statement of conservation of probability, just like a similar statement in electromagnetism would mean conservation of charge, and in hydrodynamics, conservation of mass. An alternative expression of this statement can be obtained if you drop the volume integration in Eq. 5.41 and introduce the probability density $\rho(\mathbf{r}, t) = |\Psi(\mathbf{r}, t)|^2$:

$$\frac{\partial \rho(\mathbf{r}, t)}{\partial t} + \nabla \cdot \mathbf{j} = 0. \tag{5.45}$$

This equation is called the probability continuity equation, and it looks very much like any other continuity equation: in the electrodynamic context, $\rho$ is the charge density, and $\mathbf{j}$ is the current density; in hydrodynamics, $\rho$ is the density of a fluid, and $\mathbf{j}$ is the mass flux; in thermodynamics, $\rho$ is the local energy density, and $\mathbf{j}$ is the energy flux; etc. While in quantum mechanics this equation does not describe the flow of anything material, such as charge or mass, it has very similar empirical significance. Probability current density, for instance, determines such experimentally observable characteristics as scattering cross-sections or reflection and transmission coefficients.
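
As a hedged illustration of how the current density leads to reflection and transmission coefficients, consider a particle of energy E incident from the left on a potential step of height V0 < E located at x = 0. This example, the units ħ = m = 1, the numerical values, and all function names are assumptions made for the sketch; the amplitudes r and t follow from requiring continuity of the wave function and of its first derivative at the step. The code evaluates the one-dimensional version of Eq. 5.42 on both sides of the step.

```python
import cmath
import math

HBAR = 1.0  # assumption: natural units
MASS = 1.0


def make_step_wave(energy, v0):
    """Stationary wave for a step V = 0 (x < 0), V = v0 (x >= 0), with energy > v0."""
    k1 = math.sqrt(2.0 * MASS * energy) / HBAR
    k2 = math.sqrt(2.0 * MASS * (energy - v0)) / HBAR
    r = (k1 - k2) / (k1 + k2)      # reflected amplitude from matching at x = 0
    t = 2.0 * k1 / (k1 + k2)       # transmitted amplitude from matching at x = 0

    def psi(x):
        if x < 0.0:
            return cmath.exp(1j * k1 * x) + r * cmath.exp(-1j * k1 * x)
        return t * cmath.exp(1j * k2 * x)

    return psi, r, t, k1, k2


def current(psi, x, h=1e-5):
    """1-D Eq. 5.42: j = (i hbar / 2m) (psi dpsi*/dx - psi* dpsi/dx)."""
    dpsi = (psi(x + h) - psi(x - h)) / (2.0 * h)   # central difference
    value = psi(x)
    j = 1j * HBAR / (2.0 * MASS) * (value * dpsi.conjugate() - value.conjugate() * dpsi)
    return j.real


psi, r, t, k1, k2 = make_step_wave(energy=2.0, v0=1.0)
reflection = abs(r) ** 2                 # reflected current / incident current
transmission = (k2 / k1) * abs(t) ** 2   # transmitted current / incident current
print(reflection + transmission)                 # adds up to 1
print(current(psi, -1.0), current(psi, 1.0))     # same current on both sides
```

The equality of the two currents is the stationary form of the continuity equation: with a time-independent probability density, the current cannot change from point to point in one dimension.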


5.1.4 Orbital Angular Momentum in Position Representation 

5.1.4.1 Operators 

When I first introduced angular momentum operators in Sect. 3.3.4, I emphasized that the importance of the angular momentum is derived from the fact that it commutes with the Hamiltonian of a particle in a central field. At that time I did not have the tools to prove this fact or to study eigenvectors of the angular momentum operators in any particular detail. Using the position representation for these operators, I can eliminate some of those gaps. This representation is generated by substituting Eq. 5.17 for the position representation of the momentum operator into Eqs. 3.50-3.52 with the additional understanding that the action of the position operator is reduced to mere multiplication by $\mathbf{r}$. This procedure generates the following expressions for the Cartesian components of the angular momentum defined with respect to some coordinate axes:

$$\hat{L}_x = -i\hbar\, y \frac{\partial}{\partial z} + i\hbar\, z \frac{\partial}{\partial y}, \tag{5.46}$$

$$\hat{L}_y = -i\hbar\, z \frac{\partial}{\partial x} + i\hbar\, x \frac{\partial}{\partial z}, \tag{5.47}$$

$$\hat{L}_z = -i\hbar\, x \frac{\partial}{\partial y} + i\hbar\, y \frac{\partial}{\partial x}. \tag{5.48}$$
Fig. 5.1 Spherical coordinate system

These expressions imply that operators $\hat{L}_{x,y,z}$ act on wave functions defined in terms of Cartesian coordinates x, y, and z of a position vector $\mathbf{r}$. However, Cartesian coordinates are not the only way to characterize the position of a point in space. Spherical coordinates, for instance, can do the same job, and in some instances, we might want to have operators acting on functions $f(r, \theta, \varphi)$, where r, $\theta$, and $\varphi$ are radial, polar, and azimuthal spherical coordinates (see Fig. 5.1). To make sure that there is no confusion left, let me reiterate: I am using spherical coordinates to describe the position dependence of the wave functions in the coordinate representation, but I keep using the Cartesian coordinate system to introduce components of the vector of the angular momentum and respective operators. It is important that the two coordinate systems are mutually dependent: the spherical angles $\theta$ and $\varphi$ are defined with respect to the same axes, which are used to define Cartesian components of the angular momentum.

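To proceed with my plan, I need to remind you of the well-known relations between Cartesian and spherical coordinates:

$$z = r\cos\theta, \tag{5.49}$$

$$x = r\sin\theta\cos\varphi, \tag{5.50}$$

$$y = r\sin\theta\sin\varphi, \tag{5.51}$$

and

$$r = \sqrt{x^2 + y^2 + z^2}, \tag{5.52}$$

$$\theta = \arccos\frac{z}{\sqrt{x^2 + y^2 + z^2}}, \tag{5.53}$$

$$\varphi = \arctan\frac{y}{x}. \tag{5.54}$$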
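To make the transition from the operators defined in the space of functions $f(x, y, z)$ to the operators acting on functions $f(r, \theta, \varphi)$, I shall use the regular chain rule for differentiation of functions of several variables, which in this case takes the following form:

$$\frac{\partial}{\partial x} = \frac{\partial r}{\partial x} \frac{\partial}{\partial r} + \frac{\partial \theta}{\partial x} \frac{\partial}{\partial \theta} + \frac{\partial \varphi}{\partial x} \frac{\partial}{\partial \varphi},$$

$$\frac{\partial}{\partial y} = \frac{\partial r}{\partial y} \frac{\partial}{\partial r} + \frac{\partial \theta}{\partial y} \frac{\partial}{\partial \theta} + \frac{\partial \varphi}{\partial y} \frac{\partial}{\partial \varphi},$$

$$\frac{\partial}{\partial z} = \frac{\partial r}{\partial z} \frac{\partial}{\partial r} + \frac{\partial \theta}{\partial z} \frac{\partial}{\partial \theta} + \frac{\partial \varphi}{\partial z} \frac{\partial}{\partial \varphi}.$$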

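I will illustrate this transition by deriving the expression for $\hat{L}_z$ in spherical coordinates. According to Eq. 5.48, I need the derivative operators $\partial/\partial x$ and $\partial/\partial y$. Using Eqs. 5.52-5.54, as well as Eqs. 5.49-5.51, I get

$$\frac{\partial r}{\partial x} = \frac{x}{\sqrt{x^2 + y^2 + z^2}} = \sin\theta\cos\varphi.$$

To compute the derivative $\partial\theta/\partial x$, it is more convenient to transform Eq. 5.53 into

$$\cos\theta = \frac{z}{\sqrt{x^2 + y^2 + z^2}}$$

and differentiate it with respect to x:

$$-\sin\theta\, \frac{\partial \theta}{\partial x} = -\frac{zx}{\left(x^2 + y^2 + z^2\right)^{3/2}}.$$

This expression can now be transformed into

$$\frac{\partial \theta}{\partial x} = \frac{r^2 \cos\theta \sin\theta \cos\varphi}{r^3 \sin\theta} = \frac{\cos\theta\cos\varphi}{r}.$$

Similarly, starting with $\tan\varphi = y/x$, I find

$$\frac{1}{\cos^2\varphi} \frac{\partial \varphi}{\partial x} = -\frac{y}{x^2} = -\frac{\sin\varphi}{r\sin\theta\cos^2\varphi} \;\Rightarrow\; \frac{\partial \varphi}{\partial x} = -\frac{\sin\varphi}{r\sin\theta}.$$

Gathering all these results together, I finally have

$$y \frac{\partial}{\partial x} = r\sin^2\theta \sin\varphi\cos\varphi\, \frac{\partial}{\partial r} + \sin\theta\cos\theta \sin\varphi\cos\varphi\, \frac{\partial}{\partial \theta} - \sin^2\varphi\, \frac{\partial}{\partial \varphi}. \tag{5.55}$$

Now I need to repeat these calculations for the $x\,\partial/\partial y$ contribution to Eq. 5.48:

$$\frac{\partial r}{\partial y} = \frac{y}{\sqrt{x^2 + y^2 + z^2}} = \sin\theta\sin\varphi,$$

$$-\sin\theta\, \frac{\partial \theta}{\partial y} = -\frac{zy}{\left(x^2 + y^2 + z^2\right)^{3/2}} \;\Rightarrow\; \frac{\partial \theta}{\partial y} = \frac{\cos\theta\sin\varphi}{r},$$

$$\frac{1}{\cos^2\varphi} \frac{\partial \varphi}{\partial y} = \frac{1}{x} = \frac{1}{r\sin\theta\cos\varphi} \;\Rightarrow\; \frac{\partial \varphi}{\partial y} = \frac{\cos\varphi}{r\sin\theta},$$

$$x \frac{\partial}{\partial y} = r\sin^2\theta \cos\varphi\sin\varphi\, \frac{\partial}{\partial r} + \sin\theta \sin\varphi \cos\theta\cos\varphi\, \frac{\partial}{\partial \theta} + \cos^2\varphi\, \frac{\partial}{\partial \varphi}. \tag{5.56}$$

Finally, combining Eqs. 5.55 and 5.56, I am getting my reward for all this hard work because the derived expression for $\hat{L}_z$ is so remarkably simple:

$$\hat{L}_z = -i\hbar\, x \frac{\partial}{\partial y} + i\hbar\, y \frac{\partial}{\partial x} = -i\hbar\, \frac{\partial}{\partial \varphi}. \tag{5.57}$$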
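
If you would rather not trust the algebra, the equivalence of the Cartesian form (Eq. 5.48) and the spherical form (Eq. 5.57) can be spot-checked numerically. The sketch below is an illustration under assumptions: ħ = 1, central finite differences for all derivatives, and the test function (x + iy) exp(−r²), which I picked because it is expected to be an eigenfunction of the z-component with eigenvalue ħ. The azimuthal angle is computed with `atan2` rather than a bare arctangent so that the quadrant comes out right. All names are hypothetical.

```python
import math

HBAR = 1.0  # assumption: natural units


def f(x, y, z):
    """Test function (assumed example): (x + i y) exp(-r^2)."""
    return (x + 1j * y) * math.exp(-(x * x + y * y + z * z))


def lz_cartesian(g, x, y, z, h=1e-5):
    """Eq. 5.48: L_z g = -i hbar (x dg/dy - y dg/dx), central differences."""
    dg_dx = (g(x + h, y, z) - g(x - h, y, z)) / (2.0 * h)
    dg_dy = (g(x, y + h, z) - g(x, y - h, z)) / (2.0 * h)
    return -1j * HBAR * (x * dg_dy - y * dg_dx)


def lz_spherical(g, x, y, z, h=1e-5):
    """Eq. 5.57: L_z g = -i hbar dg/dphi at fixed r and theta."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r)
    phi = math.atan2(y, x)

    def g_of_phi(angle):
        # Eqs. 5.49-5.51 with r and theta held fixed
        return g(r * math.sin(theta) * math.cos(angle),
                 r * math.sin(theta) * math.sin(angle),
                 r * math.cos(theta))

    return -1j * HBAR * (g_of_phi(phi + h) - g_of_phi(phi - h)) / (2.0 * h)


point = (0.4, -0.3, 0.5)
print(lz_cartesian(f, *point))
print(lz_spherical(f, *point))
print(HBAR * f(*point))  # all three agree to within discretization error
```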


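This result justifies going to all the trouble involved in transitioning to spherical coordinates. One can also derive similar expressions for the x- and y-components of the angular momentum, but they are not that pretty:

$$\hat{L}_x = i\hbar \left( \sin\varphi\, \frac{\partial}{\partial \theta} + \cot\theta\cos\varphi\, \frac{\partial}{\partial \varphi} \right), \tag{5.58}$$

$$\hat{L}_y = i\hbar \left( -\cos\varphi\, \frac{\partial}{\partial \theta} + \cot\theta\sin\varphi\, \frac{\partial}{\partial \varphi} \right). \tag{5.59}$$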

The remarkable simplicity of $\hat{L}_z$ expressed in terms of the derivative with respect to a spherical coordinate is the main reason why it became customary to consider the pair $\hat{L}_z$, $\hat{L}^2$ as a set of commuting operators when dealing with the angular momentum. Derivation of the expression for operator $\hat{L}^2$ in terms of spherical coordinates is quite straightforward, and while it is excruciatingly tedious, it does lead to a really awesome answer:

$$\hat{L}^2 = -\hbar^2 \left[ \frac{1}{\sin\theta} \frac{\partial}{\partial \theta} \left( \sin\theta\, \frac{\partial}{\partial \theta} \right) + \frac{1}{\sin^2\theta} \frac{\partial^2}{\partial \varphi^2} \right]. \tag{5.60}$$


However, in order to appreciate its awesomeness, you might have to google “Laplacian operator” unless, of course, you are also awesome and remember how it looks in spherical coordinates (in Cartesian coordinates it was defined in Eq. 5.30). For your convenience I will present it here:

$$\nabla^2 = \frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial}{\partial r} \right) + \frac{1}{r^2} \left[ \frac{1}{\sin\theta} \frac{\partial}{\partial \theta} \left( \sin\theta\, \frac{\partial}{\partial \theta} \right) + \frac{1}{\sin^2\theta} \frac{\partial^2}{\partial \varphi^2} \right], \tag{5.61}$$

hoping that you notice that the angular part of the Laplacian (the expression in square brackets) is identical to $-\hat{L}^2/\hbar^2$. And this fact is not left without important consequences. Recall that the Laplacian operator defines the coordinate representation of the kinetic energy operator $\hat{K} = -\hbar^2 \nabla^2 / 2m_e$, which now can be written down in spherical coordinates as

$$\hat{K} = -\frac{\hbar^2}{2m_e r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial}{\partial r} \right) + \frac{\hat{L}^2}{2m_e r^2}. \tag{5.62}$$

This presentation of the kinetic energy makes it plainly obvious that $[\hat{K}, \hat{L}^2] = 0$.

Fig. 5.2 Breaking down a classical momentum in components

Indeed, the radial part of kinetic energy commutes with L 2 because they contain 
derivatives with respect to different coordinates, and the angular part is simply 
proportional to L 2 , which obviously commutes with itself. To get an even better 
appreciation of Eq. 5.62, it is interesting to consider a classical kinetic energy 
rewritten in terms of two mutually perpendicular components of the momentum: 
p r , which is normal to the particle’s position vector, and p r , which is aligned with 
it. Taking into account that the momentum is tangential to the particle’s trajectory, 
you can see (Fig. 5.2) that p T = p sin#, where & is the angle between the vector 
of momentum and the position vector at a given point. In terms of these two 
components, the kinetic energy can be presented as 



p 2 sin 2 ft 


e 


P 


2 

r 


2 m e 


2m. 
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Now, let me play a bit with the first of these terms multiplying its numerator and 
denominator by r 2 : 


p 2 sin 2 ft p 2 r 2 sin 2 & 

2 m e 2 m e r 2 

I am sure you recognize now that the numerator of this expression is |rx/?| 2 , which 
is nothing, but the classical angular momentum L = rxp. Thus, the classical kinetic 
energy can be presented as 



L 2 

2 m e r 2 


where the last term “miraculously” reproduces a similar term in quantum mechani¬ 
cal Eq. 5.62. Isn’t it true that physics (and math) work in mysterious ways? 
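The classical decomposition above is easy to check numerically. The following is only a minimal sketch: the helper names (`cross`, `dot`, `kinetic_energy_split`) and the sample vectors are illustrative assumptions, not anything defined in this text.

```python
import math

def cross(a, b):
    """Cross product of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def kinetic_energy_split(r, p, mass):
    """Return (radial part, angular part, total) of the classical p^2 / 2m."""
    r_len = math.sqrt(dot(r, r))
    p_r = dot(r, p) / r_len              # momentum component along r
    L = cross(r, p)                      # classical angular momentum r x p
    radial = p_r ** 2 / (2 * mass)
    angular = dot(L, L) / (2 * mass * r_len ** 2)
    total = dot(p, p) / (2 * mass)
    return radial, angular, total

radial, angular, total = kinetic_energy_split((1.0, 2.0, -0.5), (0.3, -1.2, 2.0), 1.0)
print(radial + angular, total)   # the two numbers should coincide
```

The agreement is a consequence of the identity $|\mathbf{r}\times\mathbf{p}|^2 = r^2p^2 - (\mathbf{r}\cdot\mathbf{p})^2$, which is what the multiplication by $r^2$ in the text exploits.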

Now I can fulfill my promise made in Sect. 3.3.4 and prove that the operators
of the angular momentum commute with the Hamiltonian if the particle's potential
energy belongs to the class of central potentials. Actually, Eq. 5.60 makes the proof
quite trivial: the angular momentum operators in the position representation contain
only derivatives with respect to angular variables, so that if the potential energy
$V(r)$ depends only on the radial coordinate $r$ (definition of the central potential!),
then neither $\hat{L}_z$ nor $\hat{L}^2$ affects $V(r)$, so that $\hat{L}^2V(r)\psi(r,\theta,\varphi) = V(r)\hat{L}^2\psi(r,\theta,\varphi)$,
and the same is obviously true for the $\hat{L}_z$ operator. Since I already showed that the
angular momentum commutes with the kinetic energy, the last remark completes
the required proof.


The direct consequence of the vanishing commutators $[\hat{H}, \hat{L}_z]$ and $[\hat{H}, \hat{L}^2]$ is that
the eigenvectors of Hamiltonians with a central potential can be chosen to be common
eigenvectors of $\hat{L}^2$ and $\hat{L}_z$, which makes the task of finding these eigenvectors especially
important. And this is what, without further ado, I am going to do now.


5.1.4.2 Eigenvectors 

First of all, let me remind you that we are looking for the functions that represent
common eigenvectors of the operators $\hat{L}^2$ and $\hat{L}_z$. This means that these functions must
simultaneously obey both equations:

$$\hat{L}_z\psi_{lm}(\theta,\varphi) = \hbar m\,\psi_{lm}(\theta,\varphi) \tag{5.63}$$

and

$$\hat{L}^2\psi_{lm}(\theta,\varphi) = \hbar^2l(l+1)\,\psi_{lm}(\theta,\varphi). \tag{5.64}$$

I begin with the operator $\hat{L}_z$, whose eigenvectors in the coordinate representation are
particularly easy to find. First, let me notice that this operator only contains
derivatives with respect to $\varphi$, so that the angular variable $\theta$ plays here the role of
a “silent” parameter, a constant, as far as the operator $\hat{L}_z$ is concerned. In formal language
it means that the dependence on $\theta$ may appear in the function $\psi_{lm}(\theta,\varphi)$ only as a factor in
front of the “main” function, which depends only on $\varphi$:

$$\psi_{lm}(\theta,\varphi) = P_l^m(\theta)\,\Phi_m(\varphi). \tag{5.65}$$

Substituting this form into Eq. 5.63, you can see that $P_l^m(\theta)$ indeed behaves as a
constant and can be canceled. The resulting equation for the remaining function,

$$-i\hbar\,\frac{d\Phi_m(\varphi)}{d\varphi} = \hbar m\,\Phi_m(\varphi),$$

has an obvious solution

$$\Phi_m(\varphi) = \frac{1}{\sqrt{2\pi}}\exp(im\varphi). \tag{5.66}$$


Now consider how the function $\Phi_m(\varphi)$ evolves when the position vector rotates around
the axis $Z$. After one complete rotation, which corresponds to the change of $\varphi$ by
$2\pi$, the position vector returns to the initial position. It would have been weird
if the wave function did not return to its initial value as well. In a somewhat
more sophisticated language, it means that the function $\Phi_m(\varphi)$ is expected to be periodic
in $\varphi$ with period $2\pi$. This can be achieved only if $m$ takes integer values: $m =
0, \pm1, \pm2, \cdots$. This is only half of the eigenvalues of the operator $\hat{L}_z$ found by
algebraic methods in Sect. 3.3.4. The eigenvalues corresponding to half-integer
values of $m$ result in solutions that change their sign upon rotation by $2\pi$ and
shall be discarded. It does not mean, of course, that half-integer values of $m$ have no
place in quantum theory; it only means that they cannot correspond to eigenvectors
permitted in the position representation. The factor $1/\sqrt{2\pi}$ in Eq. 5.66 ensures that the
wave function $\Phi_m(\varphi)$ is normalized with respect to the inner product defined as

$$\langle\Phi_{m_1}|\Phi_{m_2}\rangle = \int_0^{2\pi}\Phi_{m_1}^*(\varphi)\,\Phi_{m_2}(\varphi)\,d\varphi = \frac{1}{2\pi}\int_0^{2\pi}\exp\left[i(m_2-m_1)\varphi\right]d\varphi. \tag{5.67}$$
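For readers who like to see numbers, here is a small sketch that evaluates the integral of Eq. 5.67 with a plain Riemann sum and also shows the sign flip of a half-integer solution after a full turn. The function names `phi_m` and `inner_product` and the number of integration steps are assumptions made for this illustration only.

```python
import cmath
import math

def phi_m(m, angle):
    """Eigenfunction of L_z from Eq. 5.66: exp(i m angle) / sqrt(2 pi)."""
    return cmath.exp(1j * m * angle) / math.sqrt(2 * math.pi)

def inner_product(m1, m2, steps=2000):
    """Riemann-sum estimate of Eq. 5.67 over one full turn of the azimuthal angle."""
    h = 2 * math.pi / steps
    total = sum(phi_m(m1, k * h).conjugate() * phi_m(m2, k * h) for k in range(steps))
    return total * h

print(abs(inner_product(2, 2)))     # norm of a single eigenfunction
print(abs(inner_product(1, -3)))    # overlap of two different eigenfunctions
# Ratio of the values after and before a full turn for a half-integer m:
print(phi_m(0.5, 2 * math.pi) / phi_m(0.5, 0.0))
```

For a periodic integrand the uniform sum is accurate to rounding error, so a modest number of steps is enough here.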

It is obvious that with this definition of the inner product, the functions representing 
the eigenvectors are not only normalized but also orthogonal. The integral in 
Eq. 5.67 is a part of a surface integral carried out over the surface of a sphere, which 
in spherical coordinates has the following form: 


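$$\langle\psi_1|\psi_2\rangle = \int_0^{\pi}\!\!\int_0^{2\pi}d\theta\,d\varphi\,\sin\theta\;\psi_1^*(\theta,\varphi)\,\psi_2(\theta,\varphi), \tag{5.68}$$

where $d\theta\,d\varphi\,\sin\theta$ is a spherical area element. The remaining integration over the polar
angle $\theta$ defines the inner product for the yet unknown functions $P_l^m(\theta)$:

$$\langle P_{l_1}^m|P_{l_2}^m\rangle = \int_0^{\pi}d\theta\,\sin\theta\,\left[P_{l_1}^m(\theta)\right]^*P_{l_2}^m(\theta). \tag{5.69}$$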


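These functions are found by substituting $\psi_{lm}(\theta,\varphi) = P_l^m(\theta)\exp(im\varphi)$ into
Eq. 5.64, which results in the following equation:

$$-\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right]P_l^m(\theta)\exp(im\varphi) = l(l+1)\,P_l^m(\theta)\exp(im\varphi).$$

Carrying out the differentiation with respect to $\varphi$ and canceling the exponential
factor results in the following equation for $P_l^m(\theta)$:

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{dP_l^m}{d\theta}\right) - \frac{m^2}{\sin^2\theta}\,P_l^m + l(l+1)\,P_l^m = 0.$$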


Do you see now why I kept both indexes $l$ and $m$ in the notation for $P_l^m$? By
introducing the new variable $x = \cos\theta$, this equation can be rewritten as

$$\frac{d}{dx}\left[(1-x^2)\frac{dP_l^m}{dx}\right] + \left[l(l+1) - \frac{m^2}{1-x^2}\right]P_l^m = 0, \tag{5.70}$$

where I used the relation $d/d\theta = (dx/d\theta)\,d/dx = -\sin\theta\,d/dx$ and replaced $\sin^2\theta$
with $1-\cos^2\theta = 1-x^2$.

This equation is very well known in mathematical physics as the general Legendre
equation, whose solutions can be presented in the form of associated Legendre
functions, $P_l^m(x) = P_l^m(\cos\theta)$. As is clear from the relation $x = \cos\theta$,
the functions $P_l^m(x)$ are defined on the interval $x\in[-1,1]$, where they are
orthogonal with the inner product defined as $\int_{-1}^{1}P_{l_1}^m(x)P_l^m(x)\,dx$:

$$\int_{-1}^{1}P_{l_1}^m(x)\,P_l^m(x)\,dx = \frac{2\,(l+m)!}{(2l+1)\,(l-m)!}\,\delta_{l\,l_1}. \tag{5.71}$$

You may want to notice that the substitution of the integration variable $x = \cos\theta$
converts this integral into the form identical to the integral in Eq. 5.69.

The proof of orthogonality of the Legendre functions is fairly standard for
differential equations of this kind, and you will benefit from learning how to carry
it out. First, copy Eq. 5.70 for $P_{l_1}^m$:


$$\frac{d}{dx}\left[(1-x^2)\frac{dP_{l_1}^m}{dx}\right] + \left[l_1(l_1+1) - \frac{m^2}{1-x^2}\right]P_{l_1}^m = 0. \tag{5.72}$$

Now, multiply Eq. 5.70 by $P_{l_1}^m$ and Eq. 5.72 by $P_l^m$ and integrate the resulting
expressions from $-1$ to $1$:

$$\int_{-1}^{1}P_{l_1}^m\frac{d}{dx}\left[(1-x^2)\frac{dP_l^m}{dx}\right]dx + l(l+1)\int_{-1}^{1}P_{l_1}^m(x)P_l^m(x)\,dx - m^2\int_{-1}^{1}\frac{P_{l_1}^m(x)P_l^m(x)}{1-x^2}\,dx = 0,$$

$$\int_{-1}^{1}P_l^m\frac{d}{dx}\left[(1-x^2)\frac{dP_{l_1}^m}{dx}\right]dx + l_1(l_1+1)\int_{-1}^{1}P_{l_1}^m(x)P_l^m(x)\,dx - m^2\int_{-1}^{1}\frac{P_{l_1}^m(x)P_l^m(x)}{1-x^2}\,dx = 0.$$

Integration of the first terms in both equations by parts (the boundary terms vanish
because of the factor $1-x^2$, which is zero at $x=\pm1$) yields

$$-\int_{-1}^{1}(1-x^2)\frac{dP_{l_1}^m}{dx}\frac{dP_l^m}{dx}\,dx + l(l+1)\int_{-1}^{1}P_{l_1}^m(x)P_l^m(x)\,dx - m^2\int_{-1}^{1}\frac{P_{l_1}^m(x)P_l^m(x)}{1-x^2}\,dx = 0,$$

$$-\int_{-1}^{1}(1-x^2)\frac{dP_{l_1}^m}{dx}\frac{dP_l^m}{dx}\,dx + l_1(l_1+1)\int_{-1}^{1}P_{l_1}^m(x)P_l^m(x)\,dx - m^2\int_{-1}^{1}\frac{P_{l_1}^m(x)P_l^m(x)}{1-x^2}\,dx = 0,$$

and by subtracting these two expressions, you get

$$\left[l(l+1) - l_1(l_1+1)\right]\int_{-1}^{1}P_{l_1}^m(x)P_l^m(x)\,dx = 0.$$

It is quite obvious now that for $l\neq l_1$ this equality can only hold if

$$\int_{-1}^{1}P_{l_1}^m(x)P_l^m(x)\,dx = 0.$$

The derivation of the normalization coefficient in Eq. 5.71 requires a bit more effort,
and I shall leave it for the most curious readers to discover it for themselves (google
it!). You can also notice that in the case of functions with equal $l$ and different $m$,
the same line of reasoning results in a different orthogonality condition:

$$\int_{-1}^{1}\frac{P_l^{m_1}(x)P_l^m(x)}{1-x^2}\,dx = \begin{cases}0, & m\neq m_1\\[4pt] \dfrac{(l+m)!}{m\,(l-m)!}, & m = m_1\neq0\\[4pt] \infty, & m = m_1 = 0\end{cases} \tag{5.73}$$

where, again, derivation of the normalization integral lies outside the scope of this
text.

The associated Legendre polynomials can be computed using the following
expression:

$$P_l^m(x) = \frac{(-1)^m}{2^l\,l!}\,(1-x^2)^{m/2}\,\frac{d^{\,l+m}}{dx^{\,l+m}}\,(x^2-1)^l, \tag{5.74}$$

where the factor $(-1)^m$ is known as the Condon-Shortley phase and is sometimes excluded
from the definition of $P_l^m(x)$. Equation 5.74 makes sense and gives non-zero results
if and only if $l$ and $m$ are integers and $0\le l+m\le 2l \Leftrightarrow -l\le m\le l$. The
integer part of this statement is obvious: derivatives of fractional order are not
something that we can live with at this point. The second part of this statement,
which reiterates what we have already learned about the relation between these
two quantum numbers in Sect. 3.3.4, can be understood by noticing that the function
$(x^2-1)^l$ is a polynomial of order $2l$ and, therefore, can be differentiated no
more than $2l$ times before it starts producing zeroes.
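Equation 5.74 translates almost literally into code if the polynomial $(x^2-1)^l$ is stored as a list of coefficients and differentiated $l+m$ times. The sketch below is one possible way to do it, restricted to $m \ge -l$; the names `legendre_assoc` and `overlap`, as well as the Simpson-rule integration used to test Eq. 5.71, are choices made for this illustration and are not taken from the text.

```python
import math

def legendre_assoc(l, m, x):
    """P_l^m(x) from Eq. 5.74 for integer l >= 0, integer m >= -l and |x| <= 1."""
    # Coefficients of (x^2 - 1)^l, lowest power first (binomial expansion).
    coeffs = [0.0] * (2 * l + 1)
    for k in range(l + 1):
        coeffs[2 * k] = math.comb(l, k) * (-1) ** (l - k)
    # Differentiate l + m times; the list becomes empty once l + m exceeds 2l.
    for _ in range(l + m):
        coeffs = [n * c for n, c in enumerate(coeffs)][1:]
    poly = sum(c * x ** n for n, c in enumerate(coeffs))
    prefactor = (-1) ** m * max(0.0, 1 - x * x) ** (m / 2)
    return prefactor * poly / (2 ** l * math.factorial(l))

def overlap(l1, l2, m, steps=2000):
    """Simpson-rule estimate of the integral on the left of Eq. 5.71."""
    total = 0.0
    for k in range(steps + 1):
        x = -1.0 + 2.0 * k / steps
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * legendre_assoc(l1, m, x) * legendre_assoc(l2, m, x)
    return total * (2.0 / steps) / 3

print(legendre_assoc(1, 2, 0.3))   # m > l: nothing left after differentiation
print(overlap(1, 3, 1))            # different l, same m
print(overlap(2, 2, 1), 2 * math.factorial(3) / (5 * math.factorial(1)))
```

The last line prints the numerical norm next to the right-hand side of Eq. 5.71 for $l = 2$, $m = 1$, so the two numbers can be compared directly.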

The Legendre equation 5.70 is invariant (does not change) if you replace $m$ with $-m$.
This means that solutions of this equation characterized by $m$ and $-m$ must be
proportional to each other. Indeed, one can show that the functions defined by Eq. 5.74
satisfy the following important relation:

$$P_l^{-m}(x) = (-1)^m\frac{(l-m)!}{(l+m)!}\,P_l^m(x). \tag{5.75}$$

Finally, combining Eqs. 5.66 and 5.74 and adding corresponding normalization
coefficients, we end up with a set of functions $\psi_{lm}(\theta,\varphi) = Y_l^m(\theta,\varphi)$ known as spherical
harmonics and defined as

$$Y_l^m(\theta,\varphi) = (-1)^m\sqrt{\frac{2l+1}{4\pi}\,\frac{(l-m)!}{(l+m)!}}\;P_l^m(\cos\theta)\,e^{im\varphi}. \tag{5.76}$$

In light of the results presented above, the spherical harmonics are obviously
orthogonal and normalized:

$$\int_0^{\pi}\!\!\int_0^{2\pi}d\theta\,d\varphi\,\sin\theta\,\left[Y_{l_1}^{m_1}(\theta,\varphi)\right]^*Y_l^m(\theta,\varphi) = \delta_{l\,l_1}\delta_{m\,m_1}, \tag{5.77}$$

providing us with the position representation of normalized common eigenvectors
of the operators $\hat{L}^2$ and $\hat{L}_z$.

I will conclude this section with a brief description of main qualitative properties 
of the spherical harmonics. Numerous identities and recursion relations involving 
associated Legendre functions are well documented and are easily available in 
the literature and on the Internet. However, it is important to have a qualitative 
understanding of how spherical harmonics behave off the top of one’s head. 

The first thing to notice is the symmetry of the spherical harmonics upon
inversion of the position vector on the sphere with respect to the origin of the
coordinate system: $\mathbf{r}\to-\mathbf{r}$. This corresponds to the transformation of the angular
spherical coordinates $\theta\to\pi-\theta$, $\varphi\to\varphi+\pi$. Upon this transformation
$\exp(im\varphi)\to(-1)^m\exp(im\varphi)$, while the argument $x = \cos\theta$ of the associated
Legendre function, $P_l^m(\cos\theta)$, transforms as $\cos\theta\to\cos(\pi-\theta) = -\cos\theta$, i.e.,
we are dealing here with the inversion $x\to-x$. Associated Legendre functions have
a definite parity: they are either even (do not change) or odd (change the sign)
when their argument changes the sign. This is quite obvious from Eq. 5.74: replacing
$x\to-x$ does not change the function being differentiated or the factor preceding
differentiation, while the derivatives with respect to $x$ change their sign with each
differentiation. Since Eq. 5.74 contains $l+m$ differentiations, it is obvious that

$$P_l^m(-x) = (-1)^{l+m}P_l^m(x). \tag{5.78}$$

Combining this result with the transformation property of $\exp(im\varphi)$, we have for
the spherical harmonics

$$Y_l^m(\pi-\theta,\varphi+\pi) = (-1)^l\,Y_l^m(\theta,\varphi), \tag{5.79}$$
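The parity rule of Eq. 5.79 can be tested numerically with a few lines of code. This is a hedged sketch rather than a reference implementation: it rebuilds $P_l^m$ from Eq. 5.74 for $0\le m\le l$, assembles $Y_l^m$ following the form of Eq. 5.76, and compares both sides of Eq. 5.79 at an arbitrarily chosen point. The names `legendre_assoc`, `sph_harm`, and `parity_defect` and the sample angles are assumptions of this example; an overall sign convention would not affect the outcome.

```python
import cmath
import math

def legendre_assoc(l, m, x):
    """P_l^m(x) from Eq. 5.74 for integers 0 <= m <= l and |x| <= 1."""
    coeffs = [0.0] * (2 * l + 1)
    for k in range(l + 1):
        coeffs[2 * k] = math.comb(l, k) * (-1) ** (l - k)   # (x^2 - 1)^l
    for _ in range(l + m):
        coeffs = [n * c for n, c in enumerate(coeffs)][1:]  # one derivative
    poly = sum(c * x ** n for n, c in enumerate(coeffs))
    prefactor = (-1) ** m * max(0.0, 1 - x * x) ** (m / 2)
    return prefactor * poly / (2 ** l * math.factorial(l))

def sph_harm(l, m, theta, phi):
    """Spherical harmonic assembled in the form of Eq. 5.76 (0 <= m <= l)."""
    norm = math.sqrt((2 * l + 1) / (4 * math.pi)
                     * math.factorial(l - m) / math.factorial(l + m))
    return (-1) ** m * norm * legendre_assoc(l, m, math.cos(theta)) * cmath.exp(1j * m * phi)

def parity_defect(l, m, theta=0.7, phi=1.9):
    """|Y(pi - theta, phi + pi) - (-1)^l Y(theta, phi)|; should be close to zero."""
    inverted = sph_harm(l, m, math.pi - theta, phi + math.pi)
    return abs(inverted - (-1) ** l * sph_harm(l, m, theta, phi))

for l in range(4):
    print(l, [parity_defect(l, m) for m in range(l + 1)])
```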


which means that the spherical harmonics have definite parity: they are either even
with respect to inversion (for even values of $l$) or odd, if $l$ is an odd number.
This behavior is consistent with the fact that the operator of the orbital angular
momentum $\hat{\mathbf{L}} = \mathbf{r}\times\hat{\mathbf{p}}$ is invariant with respect to the parity transformation, and,
therefore, its eigenvectors must also be eigenvectors of the parity operator, i.e., have
a definite parity.

It is also important to have a picture of the dependence of the spherical harmonics
upon their arguments. Dependence on the azimuthal angle $\varphi$ is trivial: the real and
imaginary parts of the spherical harmonics oscillate with frequency $m$, but these
oscillations are not really significant, unless we are dealing with a superposition
state comprised of several spherical harmonics with different azimuthal numbers.
For a single spherical harmonic, relevant properties are often described by its
squared absolute value $|Y_l^m(\theta,\varphi)|^2$, which loses all dependence on $\varphi$. Dependence on the polar
angle contained in $P_l^m(\cos\theta)$ is a more interesting matter and is determined by
the values of both quantum numbers $l$ and $m$ separately, as well as by their difference
$l-m$. For instance, for $m = l$, it is easy to see that

$$P_l^l(\cos\theta)\propto\sin^l\theta,$$

which takes zero values at $\theta = 0, \pi$ (the two poles of the sphere) and has a single
maximum at the equator $\theta = \pi/2$. The width of the maximum (loosely defined)
becomes smaller with increasing $l$ (the function decreases more rapidly away from
the equator for larger $l$). If one likes pseudoclassical mind helpers (I would not even
call them “analogies”), one can think about a particle rotating around the equator
with its angular momentum pointing in the polar direction. The larger the angular
momentum is, the more torque would be required to turn it away from the polar direction,
which can be kind of loosely interpreted as a smaller probability for a particle to
deviate from the equatorial trajectory. But, please, do not take this pseudoclassical
mumbo jumbo too seriously.

The case of $m = 0$ corresponds to the classical angular momentum lying in
the equatorial plane, while the respective spherical harmonics are reduced to regular
Legendre polynomials:

$$Y_l^0(\theta,\varphi) = \sqrt{\frac{2l+1}{4\pi}}\,P_l(\cos\theta).$$

These are the only spherical harmonics which do not have zeroes at the poles of the
sphere ($x = \pm1$, or $\theta = 0, \pi$), but they have zeroes between the poles, whose number is
equal to the orbital number $l$. Obviously, the number of minimums and maximums
of these functions is always equal to $l-1$ (the only exception is $l = 0$, when we
are dealing with a constant). In the case of generic values of $m\neq0$, the spherical
harmonics vanish at the poles, and the number of their nodes in the polar direction is
equal to $l-|m|$. In my opinion, mastering the provided information will help you not
only to have a qualitative feeling for various expressions and phenomena involving
spherical harmonics but also to make quite an impression at a cocktail party. To help
you with visualizing these properties, I plotted graphs of the associated Legendre
polynomials with $l = 3$ in Fig. 5.3. To make the picture prettier, I normalized all
functions in the plot to bring their maximum values closer to each other; obviously
this procedure did not change their qualitative behavior.


5.2 Representations in Discrete Basis 

Fig. 5.3 Graphs of associated Legendre polynomials with $l = 3$ and $0\le m\le3$

Now let's talk about the representation of abstract vectors in discrete bases.
Equation 3.39, which represents vector $|\alpha\rangle$ in a basis $|\chi_n\rangle$, establishes a
one-to-one correspondence between the vector and a set of coefficients $a_n$. These
coefficients are a discrete analog of functions representing vectors in continuous
bases introduced in the previous section and can be arranged in the form of a
column vector. Thus, in this case we are representing the abstract vector space by
a space of column vectors with all the rules of matrix addition and multiplication
defined for these objects. The Hermitian conjugation in this space was discussed in
Sect. 2.2.2 and includes transitioning to the adjoint space inhabited by row vectors
with complex-conjugated elements:

$$\langle\alpha| = \sum_n a_n^*\,\langle\chi_n|.$$


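The inner product now becomes a standard matrix multiplication between a row
vector on the left and a column vector on the right:

$$\langle\alpha|\beta\rangle = \begin{bmatrix}a_1^* & a_2^* & \cdots & a_n^* & \cdots\end{bmatrix}\begin{bmatrix}b_1\\ b_2\\ \vdots\\ b_n\\ \vdots\end{bmatrix}, \tag{5.80}$$

where $b_n$ are coefficients in the expansion of ket $|\beta\rangle$ in the same basis, while the
outer or tensor product $|\alpha\rangle\langle\beta|$ is represented by a matrix formed according to the
rules of the matrix tensor product, Eq. 2.16:

$$|\alpha\rangle\langle\beta| = \begin{bmatrix}a_1\\ a_2\\ \vdots\\ a_N\\ \vdots\end{bmatrix}\begin{bmatrix}b_1^* & b_2^* & \cdots & b_N^* & \cdots\end{bmatrix} = \begin{bmatrix}a_1b_1^* & a_1b_2^* & \cdots & a_1b_N^* & \cdots\\ a_2b_1^* & a_2b_2^* & \cdots & a_2b_N^* & \cdots\\ \vdots & \vdots & \ddots & \vdots & \\ a_Nb_1^* & a_Nb_2^* & \cdots & a_Nb_N^* & \cdots\\ \vdots & \vdots & & \vdots & \ddots\end{bmatrix}. \tag{5.81}$$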












Due to the normalization requirement accepted for the state vectors, the expansion
coefficients obey the following obvious “sum” rule:

$$\sum_n|a_n|^2 = 1, \tag{5.82}$$

which is, again, a discrete analog of Eq. 5.4.

If a space of states of a given quantum system can be fully described by a discrete 
basis, these states can always be presented in the form of column and row vectors 
reducing the problem to that of a matrix algebra (remember Heisenberg’s matrix 
mechanics—this is where it finds its roots). The main difference, of course, is that 
in standard linear algebra problems, the dimension of the space is always finite, 
while normally spaces of quantum mechanical states have infinite dimensionality. 
This creates a number of technical problems of mathematical nature, but we shall let
mathematicians worry about them. At any rate, in most practical applications of
quantum theory, you wouldn’t have to deal with the entire infinitely dimensional 
space of states. Usually, it is possible to find a way to restrict attention to a 
much smaller (sometimes just two-dimensional) subspace using certain physically 
meaningful assumptions about hierarchy of interactions relevant for the problem 
under study. 

To get you started, consider this simplest of simple examples, which, however,
often gives students a headache.

Example 16 (A Basis Vector in Its Own Basis) This example deals with the
following question: what is a representation of a vector in a basis to which this vector
itself belongs? In other words, if $|\chi_n\rangle$ is one of the set of orthogonal normalized
vectors $|\chi_n\rangle$, $n = 1, 2, \cdots$, which column vector will represent it in the basis formed
by these vectors? Even though the answer to this question is almost trivial, it never
fails to confuse students. Assume, for instance, that $n = 1$. In this case I have
$|\chi_1\rangle = 1|\chi_1\rangle + 0|\chi_2\rangle + 0|\chi_3\rangle + \cdots$. Obviously, the corresponding column vector is

$$\begin{bmatrix}1\\ 0\\ 0\\ \vdots\end{bmatrix}.$$

Considering $n = 2$, I will similarly find that the column representing this vector
contains unity in the second position and zeroes everywhere else. This pattern, of
course, repeats itself for all other elements of the basis: any basis vector $|\chi_n\rangle$ is
represented, in the basis it is an element of, by a column where all components but
one are zeroes, and the only non-zero component, in the $n$-th place, is unity.

Now, if column vectors can represent vector states, it is almost obvious that
operators must be represented by matrices, in which case the word “act” would
mean matrix multiplication. Matrices multiply column vectors from the left and row
vectors from the right. The main question is how to construct a matrix representing
a given operator in a chosen basis. To answer this question, I will again rely on the
completeness relation (its discrete basis reincarnation, Eq. 3.42) for the basis vectors
$|\chi_n\rangle$. Insertion of this relation into $|\beta\rangle = \hat{T}|\alpha\rangle$ yields

$$\sum_{n=0}^{\infty}|\chi_n\rangle\langle\chi_n|\beta\rangle = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}|\chi_n\rangle\langle\chi_n|\hat{T}|\chi_m\rangle\langle\chi_m|\alpha\rangle = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}\langle\chi_n|\hat{T}|\chi_m\rangle\langle\chi_m|\alpha\rangle\,|\chi_n\rangle.$$

Taking into account that coefficients $b_n$ are given by $b_n = \langle\chi_n|\beta\rangle$ and coefficients
$a_m$ are $a_m = \langle\chi_m|\alpha\rangle$, I transform the previous equation into

$$\sum_{n=0}^{\infty}b_n|\chi_n\rangle = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}\langle\chi_n|\hat{T}|\chi_m\rangle\,a_m\,|\chi_n\rangle.$$

Now, thanks to the linear independence of the basis vectors, I can simply equate the
coefficients in front of each of $|\chi_n\rangle$ separately:

$$b_n = \sum_{m=0}^{\infty}\langle\chi_n|\hat{T}|\chi_m\rangle\,a_m. \tag{5.83}$$

This expression can be rewritten in the matrix form as

$$\mathbf{b} = \mathbf{T}\mathbf{a},$$

where I am using bold Latin letters to denote columns and matrices representing
vectors and operators in a given discrete basis. This means that the required matrix
representation of the given operator is

$$T_{nm} = \langle\chi_n|\hat{T}|\chi_m\rangle. \tag{5.84}$$

To illustrate an application of this result, consider the matrix of the operator
$\hat{S} = |\alpha\rangle\langle\beta|$ obtained by the outer product of two vectors. Using Eq. 5.84, you
immediately find

$$S_{mn} = \langle\chi_m|\alpha\rangle\langle\beta|\chi_n\rangle = a_mb_n^*,$$

in full agreement with the result obtained using the standard matrix definition of the
outer product, Eq. 5.81.
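In a finite-dimensional toy version of this construction, everything reduces to operations on lists of complex numbers. The sketch below uses hypothetical helper names (`outer`, `apply`, `inner`) and made-up three-component vectors; it checks that acting with the matrix $a_mb_n^*$ on a column $\mathbf{c}$ gives the column $\mathbf{a}$ multiplied by the inner product $\langle\beta|\gamma\rangle$, as the operator $|\alpha\rangle\langle\beta|$ should.

```python
def outer(a, b):
    """Matrix of |alpha><beta|: element S[m][n] = a_m * conj(b_n)."""
    return [[am * bn.conjugate() for bn in b] for am in a]

def apply(T, a):
    """Column b = T a, i.e. b_n = sum over m of T[n][m] * a[m] (Eq. 5.83)."""
    return [sum(T[n][m] * a[m] for m in range(len(a))) for n in range(len(T))]

def inner(a, b):
    """Inner product <a|b>: row of conjugated a_n times column of b_n."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

# Made-up expansion coefficients of three vectors in some three-dimensional basis.
a = [1 + 2j, 0.5j, -1.0 + 0j]
b = [2 - 1j, 1 + 1j, 3j]
c = [0.5 + 0j, -1j, 2 + 2j]

S = outer(a, b)
print(apply(S, c))                       # S acting on the column c
print([am * inner(b, c) for am in a])    # |alpha> times the number <beta|gamma>
```

The two printed columns should agree element by element.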

The equation for eigenvectors $\hat{T}|\alpha\rangle = \lambda|\alpha\rangle$ in the matrix representation is
reduced to the matrix equation

$$\sum_{m=0}^{\infty}T_{nm}a_m = \lambda a_n,$$

which can be rewritten as

$$\sum_{m=0}^{\infty}\left(T_{nm} - \lambda\delta_{nm}\right)a_m = 0. \tag{5.85}$$

This is essentially a shortcut notation for a system of homogeneous linear equations,
which has nontrivial (meaning non-zero) solutions only if the determinant
$\|T_{nm} - \lambda\delta_{nm}\|$ vanishes (Cramer's rule, which has already been mentioned earlier). If
you go back to Sect. 3.2.3, you will find examples of eigenvector and eigenvalue
calculations with matrices.

The matrix representation of operators is practically useful only if you know
how the operator acts on the vectors of the chosen basis. If the operator in question
is built out of basis vectors, the problem is resolved almost trivially. For instance,
the projection operators introduced in Eq. 3.41 have a simple matrix representation
in the same basis in which they are defined:

$$P_{km} = \langle\chi_k|\chi_n\rangle\langle\chi_n|\chi_m\rangle = \delta_{kn}\delta_{nm},$$

which is a matrix with a single non-zero element, $k = m = n$, on the main diagonal.
You can find other examples of matrix representations for operators of this kind in
the exercises in this chapter.

In most cases the issue of finding how operators act on the basis vectors is not 
that trivial. Often it is resolved by using position representation of operators and the 
basis vectors. This approach works especially well for the class of operators, which 
can be presented as a combination of position and momentum operators. This class 
includes many important operators, but not all of them. 


5.2.1 Discrete Representation from a Continuous One 

To illustrate this point, let me consider an example of a single particle of mass $m_e$
allowed to move freely along a linear segment of finite length $L$. The probability
that the particle's position can be anywhere outside of this segment is assumed to
be zero. This condition is most naturally expressed in the position representation,
where the Hamiltonian, which contains only the kinetic energy term, takes the form of

$$\hat{H} = \frac{\hat{p}^2}{2m_e} = -\frac{\hbar^2}{2m_e}\frac{d^2}{dx^2}. \tag{5.86}$$

The confinement of the particle inside the specified linear segment is formally
expressed by the requirement that the wave function $\psi(x)$ representing states of the
system is equal to zero outside of the allowed interval. The continuity of the wave
function then requires that it also vanishes at the terminal points of this interval.
Choosing the origin of a coordinate system at the left end of the allowed interval,
and assigning coordinate $x = L$ to its right end, I can express the confinement
conditions by requiring that the wave function vanish at both ends of the interval:

$$\psi(0) = \psi(L) = 0. \tag{5.87}$$

It is easy to check that the Schrödinger equation 5.37 for a free particle ($V(x) = 0$) has
two linearly independent solutions $\psi_+(x) = \exp(ikx)$ and $\psi_-(x) = \exp(-ikx)$,
where $k = \sqrt{2m_eE}/\hbar$. Now I need to construct a linear combination of these
functions obeying the confinement conditions, Eq. 5.87. Beginning with a general
solution

$$\psi(x) = Ae^{ikx} + Be^{-ikx},$$

I find that the requirement $\psi(0) = 0$ yields $A + B = 0$, which allows me to write the
wave function as

$$\psi(x) = A\sin kx.$$

(I used Euler's formula $\sin kx = \left(\exp(ikx) - \exp(-ikx)\right)/2i$ and incorporated the
constant $2i$ into the coefficient $A$.) The condition at $x = L$ yields

$$A\sin kL = 0$$

with two possible ways to fulfill it. One is to make $A = 0$, in which case the entire
wave function vanishes, and we definitely do not want this to happen. Thus, you are
stuck with the only other option, namely, to require that

$$kL = \pi n, \quad n = 1, 2, \cdots.$$

This result means that the states of the system considered in this example can only
be presented by a discrete set of wave functions

$$\psi_n(x) = A\sin k_nx,$$

characterized by the parameter

$$k_n = \frac{\pi n}{L}, \tag{5.88}$$

with corresponding discrete energy levels

$$E_n = \frac{\hbar^2\pi^2n^2}{2m_eL^2}. \tag{5.89}$$


The appearance of the discrete spectrum is not surprising here, of course, since
the classical motion in this example is clearly bound. The coefficient $A$ remains
unknown at this point: it cannot be fixed by the boundary
conditions, which is a fairly typical situation in problems of this kind. I, however,
have one additional weapon at my disposal, the normalization condition, which
in this case reads as (using the standard definition of the inner product for
square-integrable functions)

$$\int_{-\infty}^{\infty}|\psi_n(x)|^2\,dx = |A|^2\int_0^L\sin^2k_nx\,dx = 1 \;\Rightarrow\; A = \sqrt{\frac{2}{L}},$$

where at the last step, I chose $A$ to be a real positive quantity. This choice, while
pleasing to the eye, does not make any difference since the normalization condition only
defines $A$ up to an arbitrary phase factor of the form $\exp(i\varphi)$, in alignment with the
already mentioned general principle that vectors representing quantum states are
always defined only up to a phase.

The system of wave functions

$$\psi_n(x) = \sqrt{\frac{2}{L}}\sin\frac{\pi nx}{L} \tag{5.90}$$

forms a normalized orthogonal basis, which can be used to present any other
wave function defined on the interval $x\in[0,L]$ (one can recognize here just a
Fourier series expansion for a function defined on a finite interval). This basis can
also be used to represent various operators acting on such functions. For instance,
the Hamiltonian, Eq. 5.86, in this basis is represented by an infinite diagonal matrix:

$$H_{mn} = -\frac{\hbar^2}{2m_e}\frac{2}{L}\int_0^L\sin\frac{\pi mx}{L}\,\frac{d^2}{dx^2}\sin\frac{\pi nx}{L}\,dx = \frac{\hbar^2\pi^2n^2}{2m_eL^2}\,\frac{2}{L}\int_0^L\sin\frac{\pi mx}{L}\sin\frac{\pi nx}{L}\,dx = \frac{\hbar^2\pi^2n^2}{2m_eL^2}\,\delta_{mn}. \tag{5.91}$$


Now, assume that the particle that you follow is also subjected to an external
uniform electric field (with all other conditions and limitations intact). This will
add a potential energy term to the Hamiltonian of the form $V(x) = eFx$, where $e$
is the absolute value of the particle's charge, presumed to be negative, and $F$ is the
magnitude of the field. Now, I want you to try to present the new Hamiltonian of the
particle,

$$\hat{H} = \frac{\hat{p}^2}{2m_e} + eFx,$$

in the same basis of functions $\psi_n(x)$ defined in Eq. 5.90. The resulting matrix would
have the diagonal part given in Eq. 5.91 and the part which can be written as $eFx_{mn}$,
where

$$x_{mn} = \frac{2}{L}\int_0^Lx\sin\frac{\pi nx}{L}\sin\frac{\pi mx}{L}\,dx = \frac{1}{L}\int_0^Lx\left[\cos\frac{\pi(n-m)x}{L} - \cos\frac{\pi(n+m)x}{L}\right]dx \tag{5.92}$$

$$\begin{aligned}
&= \frac{1}{L}\frac{L}{\pi}\left[\frac{x}{n-m}\sin\frac{\pi(n-m)x}{L} - \frac{x}{n+m}\sin\frac{\pi(n+m)x}{L}\right]_0^L\\
&\quad - \frac{1}{L}\frac{L}{\pi}\left[\frac{1}{n-m}\int_0^L\sin\frac{\pi(n-m)x}{L}\,dx - \frac{1}{n+m}\int_0^L\sin\frac{\pi(n+m)x}{L}\,dx\right]\\
&= \frac{L}{\pi^2}\left[\frac{1}{(n-m)^2}\cos\frac{\pi(n-m)x}{L}\bigg|_0^L - \frac{1}{(n+m)^2}\cos\frac{\pi(n+m)x}{L}\bigg|_0^L\right]\\
&= \frac{L}{\pi^2}\left[(-1)^{n-m} - 1\right]\frac{4nm}{(n^2-m^2)^2}, \qquad n\neq m.
\end{aligned} \tag{5.93}$$


The diagonal element of this matrix, which is just the expectation value of the
coordinate, is easily found to be (from Eq. 5.92 with $n = m$) $x_{nn} = L/2$, which
has an obvious physical meaning. The total Hamiltonian in the representation based
on functions defined in Eq. 5.90 is now an infinite nondiagonal matrix:

$$H_{mn} = \left(\frac{\hbar^2\pi^2n^2}{2m_eL^2} + \frac{eFL}{2}\right)\delta_{mn} + \frac{eFL}{\pi^2}\left[(-1)^{n-m} - 1\right]\frac{4nm}{(n^2-m^2)^2}, \tag{5.94}$$

where the second term contributes only to the nondiagonal elements. The electric
field-related correction to the diagonal elements of the Hamiltonian is just a constant
and can be eliminated by choosing a different zero level for the energies, for
instance, by writing the electric field potential as $eF(x - L/2)$.
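If you would like to double-check the closed-form matrix elements of the coordinate, a direct numerical integration is a quick way to do it. The sketch below compares the off-diagonal formula of Eq. 5.93 (and the diagonal value $L/2$) with a Simpson-rule evaluation of the defining integral; the function names `x_formula` and `x_numeric`, the choice $L = 1$, and the number of steps are assumptions of this illustration.

```python
import math

def x_formula(m, n, L=1.0):
    """Closed form of x_mn: L/2 on the diagonal, Eq. 5.93 off the diagonal."""
    if m == n:
        return L / 2
    return L / math.pi ** 2 * ((-1) ** (n - m) - 1) * 4 * n * m / (n * n - m * m) ** 2

def x_numeric(m, n, L=1.0, steps=2000):
    """Simpson-rule estimate of (2/L) * integral of x sin(pi n x/L) sin(pi m x/L) over [0, L]."""
    h = L / steps
    total = 0.0
    for k in range(steps + 1):
        x = k * h
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * x * math.sin(math.pi * n * x / L) * math.sin(math.pi * m * x / L)
    return 2 / L * total * h / 3

for m in range(1, 4):
    print([round(x_formula(m, n), 6) for n in range(1, 4)],
          [round(x_numeric(m, n), 6) for n in range(1, 4)])
```

Note that the elements with even $n - m$ vanish off the diagonal, so the field couples only states of opposite parity with respect to the center of the segment.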

This example illustrates a rather general situation: often in order to find a
representation of an operator in one basis, we have to use its known representation
in a different basis. This approach works especially well with observables that
can be expressed as combinations of position and momentum operators, whose
representations in continuous bases were discussed above. This approach often
leads to nondiagonal matrices, and finding the eigenvalues and eigenvectors of
the operator of interest is reduced to finding eigenvalues and eigenvectors of
the resulting matrix. In many cases this cannot be done exactly because the
dimensionality of the resulting matrices can be infinite, but it is often possible to
truncate them and solve the problem approximately. How this is done practically
will be discussed in a separate chapter.
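As a foretaste of the truncation idea, here is a deliberately crude sketch that keeps only the two lowest states of the particle in the segment and diagonalizes the resulting $2\times2$ block of the Hamiltonian with the electric field. The units are an assumption of this example: energies are measured in units of $E_1 = \hbar^2\pi^2/(2m_eL^2)$, and `g` stands for the dimensionless field strength $eFL/E_1$; the function name `truncated_levels` is hypothetical. The off-diagonal element uses $x_{12} = -16L/(9\pi^2)$, which follows from Eq. 5.93.

```python
import math

def truncated_levels(g):
    """Eigenvalues of the 2x2 truncated Hamiltonian, in units of E_1, for g = eFL / E_1."""
    h11 = 1.0 + g / 2                        # E_1 plus the diagonal field term eFL/2
    h22 = 4.0 + g / 2                        # E_2 = 4 E_1 plus the same constant shift
    h12 = g * (-16.0 / (9 * math.pi ** 2))   # e F x_12 in units of E_1
    mean = (h11 + h22) / 2
    half_gap = math.hypot((h22 - h11) / 2, h12)
    return mean - half_gap, mean + half_gap

for g in (0.0, 0.5, 1.0, 2.0):
    print(g, truncated_levels(g))
```

The closed-form eigenvalues of a real symmetric $2\times2$ matrix are used here; the coupling pushes the two levels apart while their sum stays equal to the trace. How reliable such a small truncation is depends on the field strength, and this sketch makes no claim about that.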


5.2.2 Transition from One Discrete Basis to Another 


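Quite often you will find yourself in a situation when, having found (or being given)
a matrix representation of an operator in one discrete basis, you will need to find an
equivalent matrix representing this operator in a different basis. Here I will show
how this can be done.

So, let's assume that you have an operator $\hat{T}$ and a system of basis vectors $|\chi_n^{(\mathrm{old})}\rangle$.
The representation of this operator in this basis, as we have already established, is
given by a matrix:

$$T_{mn}^{(\mathrm{old})} = \langle\chi_m^{(\mathrm{old})}|\hat{T}|\chi_n^{(\mathrm{old})}\rangle.$$

However, I would like to re-derive this expression in a slightly different way. Let
me multiply the operator $\hat{T}$ by two unity operators expressed by the completeness
relation, Eq. 3.42, formed with the vectors of this basis:

$$\hat{T} = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}|\chi_m^{(\mathrm{old})}\rangle\langle\chi_m^{(\mathrm{old})}|\hat{T}|\chi_n^{(\mathrm{old})}\rangle\langle\chi_n^{(\mathrm{old})}| = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}T_{mn}^{(\mathrm{old})}\,|\chi_m^{(\mathrm{old})}\rangle\langle\chi_n^{(\mathrm{old})}|. \tag{5.95}$$

This representation of an operator in terms of a matrix and operators $|\chi_m^{(\mathrm{old})}\rangle\langle\chi_n^{(\mathrm{old})}|$
is akin to the expansion of a vector into a linear combination of basis vectors. Now,
let me assume that I have another basis $|\chi_m^{(\mathrm{new})}\rangle$, and I want to relate the matrix of
the operator in this basis, $T_{kl}^{(\mathrm{new})}$, to matrix $T_{mn}^{(\mathrm{old})}$. To achieve this goal, let me express
the matrix $T_{kl}^{(\mathrm{new})}$ using Eq. 5.95:

$$T_{kl}^{(\mathrm{new})} = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}T_{mn}^{(\mathrm{old})}\,\langle\chi_k^{(\mathrm{new})}|\chi_m^{(\mathrm{old})}\rangle\langle\chi_n^{(\mathrm{old})}|\chi_l^{(\mathrm{new})}\rangle.$$

This can be rewritten with the help of two new matrices,

$$U_{nl} = \langle\chi_n^{(\mathrm{old})}|\chi_l^{(\mathrm{new})}\rangle \tag{5.96}$$

and

$$\tilde{U}_{km} = \langle\chi_k^{(\mathrm{new})}|\chi_m^{(\mathrm{old})}\rangle,$$

as

$$T_{kl}^{(\mathrm{new})} = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}\tilde{U}_{km}\,T_{mn}^{(\mathrm{old})}\,U_{nl}.$$

(Note the position of the indexes in this expression, which adheres to the regular rule
for matrix multiplication.) Now I need to figure out how to construct matrices
$U$ and $\tilde{U}$ and their relation to each other. Let me first perform complex conjugation
of each element of matrix $\tilde{U}_{km}$ and take advantage of the main property of the inner
product, Eq. 2.19:

$$\tilde{U}_{km}^* = \langle\chi_k^{(\mathrm{new})}|\chi_m^{(\mathrm{old})}\rangle^* = \langle\chi_m^{(\mathrm{old})}|\chi_k^{(\mathrm{new})}\rangle = U_{mk}.$$

Thus I can see that matrix $\tilde{U}$ can be obtained from $U$ by complex conjugation and
transposition, or, expressing this in fewer words, $\tilde{U}$ is a Hermitian conjugate of $U$:
$\tilde{U} = U^\dagger$, and the matrix transformation rule can be presented as

$$T_{kl}^{(\mathrm{new})} = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}U^\dagger_{km}\,T_{mn}^{(\mathrm{old})}\,U_{nl}. \tag{5.97}$$

Now, let me focus on one particular column of matrix $U$, say, column $l_0$. Then you
can easily recognize that the quantities $\langle\chi_n^{(\mathrm{old})}|\chi_{l_0}^{(\mathrm{new})}\rangle$ are nothing but the coefficients of
the expansion of the new basis vector $|\chi_{l_0}^{(\mathrm{new})}\rangle$ in the old basis:

$$|\chi_{l_0}^{(\mathrm{new})}\rangle = \sum_n|\chi_n^{(\mathrm{old})}\rangle\langle\chi_n^{(\mathrm{old})}|\chi_{l_0}^{(\mathrm{new})}\rangle,$$

which gives a simple recipe for preparing matrix $U$: find the representation of the $n$-th
vector of the new basis in the old one and use the corresponding coefficients as the $n$-th
column of matrix $U$. Let me illustrate this rule with a simple example.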

Example 17 (Transformation to a New Basis) Consider a Hermitian matrix: 


$$\begin{bmatrix} 1 & i \\ -i & 1 \end{bmatrix}$$

and rewrite it in the basis of its own eigenvectors.

Solution

First I need to find these eigenvectors, which are given by the equation

$$\begin{bmatrix} 1 & i \\ -i & 1 \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \lambda \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} .$$

The corresponding eigenvalues are found from

$$(1-\lambda)^2 - 1 = 0 \;\Rightarrow\; -2\lambda + \lambda^2 = 0 \;\Rightarrow\; \lambda_1 = 0, \quad \lambda_2 = 2 .$$


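Now I can find two eigenvectors. For $\lambda_1 = 0$, I have

$$a_1 + i a_2 = 0 \;\Rightarrow\; |0\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ i \end{bmatrix},$$

where I used $|0\rangle$ as a notation for a normalized eigenvector belonging to $\lambda_1 = 0$. You can verify that this vector is indeed normalized.

For $\lambda_2 = 2$, the eigenvector equations become

$$a_1 + i a_2 = 2a_1 \;\Rightarrow\; |2\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ -i \end{bmatrix}.$$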

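What you need to realize now (quite obvious, but always gives students a shudder) is that the numbers in these columns are the coefficients in the representation of the new basis (vectors $|0\rangle$ and $|2\rangle$) in terms of the vectors of the old basis. Thus, the transformation matrix $U$ can be generated as

$$U = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ i & -i \end{bmatrix}$$

and its Hermitian conjugate matrix as

$$U^{\dagger} = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & -i \\ 1 & i \end{bmatrix}.$$

Plugging these matrices in the transformation rule, Eq. 5.97, I get

$$\frac{1}{2}\begin{bmatrix} 1 & -i \\ 1 & i \end{bmatrix}\begin{bmatrix} 1 & i \\ -i & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ i & -i \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 1 & -i \\ 1 & i \end{bmatrix}\begin{bmatrix} 0 & 2 \\ 0 & -2i \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 2 \end{bmatrix},$$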


which is exactly what you should have expected: a matrix in the basis of its own 
eigenvectors is diagonal with eigenvalues along the main diagonal. 
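The result of Example 17 can be checked numerically. The following is only a sketch, assuming NumPy is available; it lets `numpy.linalg.eigh` choose the eigenvectors, so the columns of `U` may differ from the hand-computed ones by phase factors, which does not affect the diagonal form of the transformed matrix.

```python
import numpy as np

# Hermitian matrix from Example 17, written in the "old" basis.
A = np.array([[1, 1j],
              [-1j, 1]])

# eigh returns eigenvalues in ascending order and orthonormal
# eigenvectors as the COLUMNS of the second return value, which is
# exactly the recipe for building the transformation matrix U.
eigenvalues, U = np.linalg.eigh(A)

# Transformation rule: T_new = U^dagger T_old U.
A_new = U.conj().T @ A @ U

print(np.round(eigenvalues, 12))
print(np.round(A_new, 12))
```

The off-diagonal elements of `A_new` vanish to machine precision, and its diagonal reproduces the eigenvalues.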

Sometimes the transformation rule connecting representation of operators in 
different bases is presented in an alternative form 

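$$T^{(new)}_{kl} = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty} \tilde{U}_{km}\,T^{(old)}_{mn}\,\tilde{U}^{\dagger}_{nl} \tag{5.98}$$

with matrix $\tilde{U}_{nl}$ defined as

$$\tilde{U}_{nl} = \langle\chi^{(new)}_n|\chi^{(old)}_l\rangle . \tag{5.99}$$

Complex conjugation of Eq. 5.99 yields

$$\tilde{U}^{*}_{nl} = \langle\chi^{(old)}_l|\chi^{(new)}_n\rangle = U_{ln} .$$

Performing matrix transposition and recalling that complex conjugation plus transposition yields Hermitian conjugation, you can see that

$$\tilde{U}^{\dagger} = U$$

and that Eq. 5.98 and Eq. 5.97 are equivalent to each other.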

The transformation matrix in the form of Eq. 5.99 appears naturally when one is 
looking for the transformation between components of the same vector written in 
two different bases. Indeed, consider a vector $|\alpha\rangle$ represented in two different bases as

$$|\alpha\rangle = \sum_i a^{(old)}_i |\chi^{(old)}_i\rangle = \sum_i a^{(new)}_i |\chi^{(new)}_i\rangle .$$

The simplest way to express coefficients $a^{(new)}_i$ in terms of coefficients $a^{(old)}_i$, which is the goal of this exercise, is to premultiply the expression above by the bra-vector $\langle\chi^{(new)}_m|$ and take advantage of the orthogonality of the basis vectors. This yields

$$a^{(new)}_m = \sum_i \langle\chi^{(new)}_m|\chi^{(old)}_i\rangle\, a^{(old)}_i = \sum_i \tilde{U}_{mi}\, a^{(old)}_i .$$

What is left for me to do now is to show that matrix $U$ defined by Eq. 5.96 is unitary. To this end I need to compute the product of two matrices $U_{nm}$ and $U^{\dagger}_{ml}$, using the standard matrix multiplication rule:

$$\left(UU^{\dagger}\right)_{nl} = \sum_m U_{nm} U^{\dagger}_{ml} .$$

Substituting here Eq. 5.96, I can write

$$\sum_m U_{nm} U^{\dagger}_{ml} = \sum_m \langle\chi^{(old)}_n|\chi^{(new)}_m\rangle\langle\chi^{(new)}_m|\chi^{(old)}_l\rangle = \langle\chi^{(old)}_n|\chi^{(old)}_l\rangle = \delta_{nl},$$

where I replaced the sum over $m$ with a unity operator because it is again just
a completeness condition and used the orthonormalization of the vectors of the
basis to replace their inner product with Kronecker’s delta-symbol. This calculation
reveals that $U^{\dagger} = U^{-1}$, which is the definition of the unitary matrix. One should
not be surprised that the transformation of the vector components from one basis to 
another is provided by a unitary matrix. Indeed, such a transformation clearly should 
not change the norm of the vector, and it is, indeed, one of the important properties 
of the unitary operators. 
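A short numerical illustration of the last two statements may be helpful. This is a sketch under stated assumptions: the new basis is taken to be the normalized eigenvectors of the matrix from Example 17 with the phase choice $(1, i)/\sqrt{2}$ and $(1, -i)/\sqrt{2}$, and the components `a_old` are arbitrary illustrative numbers.

```python
import numpy as np

# Columns of U are the new basis vectors expressed in the old basis
# (assumed phase choice: (1, i)/sqrt(2) and (1, -i)/sqrt(2)).
U = np.array([[1, 1],
              [1j, -1j]]) / np.sqrt(2)

# Unitarity: U U^dagger should be the identity matrix.
identity_check = U @ U.conj().T

# Components of some vector in the old basis (hypothetical values).
a_old = np.array([0.6, 0.8j])

# a_new[m] = sum_i <chi_m(new)|chi_i(old)> a_old[i], i.e. a_new = U^dagger a_old.
a_new = U.conj().T @ a_old

norm_old = np.linalg.norm(a_old)
norm_new = np.linalg.norm(a_new)
print(identity_check.round(12))
print(norm_old, norm_new)
```

The two printed norms coincide, as they must for a unitary change of basis.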


5.2.3 Spin Operators 


The approach to generating representation of operators outlined in the previous 
section would not work for operators which cannot be built out of momentum and 
position. If, however, you somehow know the eigenvalues of the operator in question 
(most likely this knowledge comes by distilling empirical facts), you can construct 
the matrix of this operator in the basis of its own eigenvectors. Indeed, if $|\chi_m\rangle$ is
the eigenvector of $\hat{T}$ corresponding to eigenvalue $t_m$, i.e., $\hat{T}|\chi_m\rangle = t_m|\chi_m\rangle$, then
Eq. 5.84 immediately gives

$$T_{nm} = \langle\chi_n|\hat{T}|\chi_m\rangle = t_m\langle\chi_n|\chi_m\rangle = t_m\delta_{nm} .$$


Thus, any operator in the basis of its own eigenvectors is presented by a diagonal 
matrix with eigenvalues along the main diagonal. Unfortunately, we often have to 
deal with a set of non-commuting operators, only one of which can be presented by a 
diagonal matrix. The question then remains how to generate a matrix representation 
of other non-commuting operators in the same basis. Fortunately, in all practical 
situations, this problem can be solved if one knows commutation relations between 
relevant operators. I will illustrate this approach by considering representation of 
angular momentum operators in the situation when eigenvalues of the z-component
of the angular momentum can only take two values, $+\hbar/2$ and $-\hbar/2$. Quantum
numbers $m$ and $l$ introduced in Sect. 3.3.4 take in this case values ±1/2 and 1/2,
respectively. In Sect. 5.1.4, you saw that the orbital angular momentum, which
is constructed of position and momentum operators, admits only integer values
for these numbers. The suggested half-integer values, which are allowed by the
algebraic properties of these operators, can, therefore, correspond only to a very
special angular momentum of electrons not related to their orbital motion. This
intrinsic angular momentum is known as spin. Leaving more detailed discussion 
of this quantity till later, here let’s just accept its existence and use it to illustrate 
a method of generating matrix representation of operators, which do not have a 
position or momentum representation. 

To distinguish between spin and orbital angular momentum, I will introduce
special notations for the former, designating the respective operators as $\hat{S}_x$, $\hat{S}_y$, and $\hat{S}_z$,
which have the same meaning as operators $\hat{L}_x$, $\hat{L}_y$, and $\hat{L}_z$ of Sect. 3.3.4. Accordingly,
I will replace the quantum number $l$ with $s$ and $m$ with $m_s$. It is important to realize
from the outset that, while the orbital quantum number $l$ is allowed to take any non-negative integer
value, the value of the respective spin number $s$ is fixed at 1/2 and cannot be
changed: it is an intrinsic property of electrons just like their mass or charge. Thus,
the only quantum number which can be used to distinguish between different spin
states is $m_s$.

Since m s takes on only two distinct values, there exist only two respective states 
described by eigenvectors of operator S z . Thus, the space occupied by different spin 
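states is two-dimensional, and the respective vectors are represented by 2 × 1 column
vectors, and operators are represented by 2 × 2 matrices. In the basis of its own
eigenvectors, $\hat{S}_z$ is simply a diagonal matrix:

$$S_z = \frac{\hbar}{2}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \tag{5.100}$$

while the states take the form of columns

$$|1/2\rangle = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad |-1/2\rangle = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \tag{5.101}$$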


where I chose to number the state corresponding to the positive eigenvalue as first.
(This choice determines the positions of negative and positive elements in the matrix
$S_z$ and the ones and zeroes in the corresponding columns.) An arbitrary state in the
space of spin states can be written down as a linear combination of the basis vectors:

$$|\chi\rangle = a\begin{bmatrix} 1 \\ 0 \end{bmatrix} + b\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix}. \tag{5.102}$$


The result expressed by Eq. 5.100 is somewhat obvious, and our main task is to
find matrices realizing a representation of two remaining components of the spin
angular momentum, $\hat{S}_x$ and $\hat{S}_y$, in this basis. (Operator $\hat{S}^2$ in this instance is trivial:
it is diagonal with identical diagonal elements equal to $\hbar^2 s(s+1) = 3\hbar^2/4$, so it is
proportional to an identity matrix.)

I begin solving this problem by focusing on operators $\hat{S}_{\pm} = \hat{S}_x \pm i\hat{S}_y$, which are
spin analogs of the ladder operators $\hat{L}_{\pm}$ introduced in Eqs. 3.64 and 3.65. Since I
postulated that spin operators obey the same commutation relations as operators
$\hat{L}_{x,y,z}$, I can use the results obtained for these operators, in particular Eq. 3.75
describing how operator $\hat{L}_{+}$ acts on eigenvectors of $\hat{L}_z$. Adapting this equation to
the case of spin states, I can write

$$\hat{S}_{+}|s, m_s\rangle = \hbar\sqrt{\frac{3}{4} - m_s(m_s+1)}\;|s, m_s+1\rangle , \tag{5.103}$$

where I took into account that $s = 1/2$. Applying this equation to the only two
existing states $|1/2\rangle$ and $|-1/2\rangle$ (I dropped quantum number $s$, because it never
changes), I have

$$\hat{S}_{+}|1/2\rangle = 0, \qquad \hat{S}_{+}|-1/2\rangle = \hbar\,|1/2\rangle , \tag{5.104}$$

from which you can immediately infer that the matrix representation of $\hat{S}_{+}$ is

$$S_{+} = \hbar\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}. \tag{5.105}$$


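The matrix representation for the lowering operator $\hat{S}_{-}$ can be derived in a similar
way using Eq. 3.76, which yields

$$\hat{S}_{-}|1/2\rangle = \hbar\,|-1/2\rangle , \qquad \hat{S}_{-}|-1/2\rangle = 0, \tag{5.106}$$

but it is much faster simply to recall that $\hat{S}_{-} = \hat{S}_{+}^{\dagger}$, so that the respective matrix is
obtained by matrix transposition and complex conjugation of Eq. 5.105:

$$S_{-} = \hbar\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}. \tag{5.107}$$

Now, using the definition of the ladder operators, you can write for $\hat{S}_x$ and $\hat{S}_y$:

$$\hat{S}_x = \frac{1}{2}\left(\hat{S}_{+} + \hat{S}_{-}\right) \tag{5.108}$$

$$\hat{S}_y = \frac{1}{2i}\left(\hat{S}_{+} - \hat{S}_{-}\right) \tag{5.109}$$

which together with Eqs. 5.105 and 5.107 generate the required matrices:

$$S_x = \frac{\hbar}{2}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \tag{5.110}$$

$$S_y = \frac{\hbar}{2}\begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix} \tag{5.111}$$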


Equations 5.110 and 5.111 provide the solution to the problem of finding the 
matrix representation for operators, which cannot be reduced to combinations of 
the position and momentum. As you can see, the commutation relations played the 
crucial role in solving this problem. 
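The construction just described is easy to reproduce numerically. The sketch below assumes NumPy and units in which $\hbar = 1$ (an assumption made only for convenience); it builds the ladder-operator matrices, forms $S_x$ and $S_y$ from them, and lets you check the angular momentum commutation relations. The helper `commutator` is a name introduced here for illustration.

```python
import numpy as np

hbar = 1.0  # units with hbar = 1 (assumption of this sketch)

# Basis order: |+1/2> first, |-1/2> second.
S_z = 0.5 * hbar * np.array([[1, 0], [0, -1]], dtype=complex)

# Raising operator: S+|1/2> = 0, S+|-1/2> = hbar |1/2>.
S_plus = hbar * np.array([[0, 1], [0, 0]], dtype=complex)
# Lowering operator is the Hermitian conjugate of the raising operator.
S_minus = S_plus.conj().T

# Cartesian components from the definition S_pm = S_x +- i S_y.
S_x = 0.5 * (S_plus + S_minus)
S_y = (S_plus - S_minus) / (2j)


def commutator(a, b):
    """Return the matrix commutator [a, b] = ab - ba."""
    return a @ b - b @ a


# Total spin squared; expected to be (3/4) hbar^2 times the identity.
S_squared = S_x @ S_x + S_y @ S_y + S_z @ S_z

print(S_x)
print(S_y)
print(commutator(S_x, S_y))
```

The printed commutator equals $i\hbar S_z$, and the same holds for cyclic permutations of the components.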


5.3 Problems 
Section 5.1.1 


Problem 54 Reproduce calculations leading to Eq. 5.16 for the momentum repre¬ 
sentation of the coordinate. 

Problem 55 Derive Eq. 5.20 generalizing the approach that led to Eq. 5.15 in 
Sect. 5.1.1. 

Problem 56 Derive Eq. 5.24 using the same method which I used deriving Eq. 5.23 
(do not attempt to simply invert the previous equation). 

Problem 57 Assuming that function Xs(q) presenting eigenvectors of operator S in 
the basis of operator Q is given by 

Xs(q) = Ae^-^ 2 , 

find the integral representation of the operator S in this basis. 


Section 5.1.2 

Problem 58 

1. Prove that the momentum operator is “odd” (changes its sign upon the parity 
transformation). 

2. Prove that the operator of the angular momentum is invariant with respect to the 
parity transformation (“even”). 


Section 5.1.3 

Problem 59 Which of the following can be used as wave functions describing 
states of the discrete spectrum: 
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1. e* 2 / 2 

2. (2x 2 -x 4 /3)e-* 2 / 2 

3. Asinkx 

4. B exp (ikx) + C exp (—ikx) 

5. xe~ M 

_ 2 

6. Ae x coskx 

Problem 60 It is known that a potential energy of a quantum particle exhibits a 
finite discontinuity at point x = 0. It is also known that for x < 0 and x > 0, the 
wave functions of the particle are presented by 

^ (x) = j A b 2 + 2 ) ex p Hi* 2 ) * < 0 

(B exp (—o' 2 -x 2 ) x > 0. 

Using continuity of the wave function and its derivative, establish relations between 
parameters A, B, a i, and a 2 . 

Problem 61 Prove the following identity: 

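$$\Psi^{*}(\mathbf{r},t)\nabla^2\Psi(\mathbf{r},t) - \Psi(\mathbf{r},t)\nabla^2\Psi^{*}(\mathbf{r},t) = \nabla\cdot\left[\Psi^{*}(\mathbf{r},t)\nabla\Psi(\mathbf{r},t) - \Psi(\mathbf{r},t)\nabla\Psi^{*}(\mathbf{r},t)\right].$$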

Problem 62 Compute probability current densities for a particle in the states 
described by the following wave functions: 

1. xf/(x) = A exp (ikz) + B exp (— ikz) 

2. xjr(x) = Acoskx 

3. i/r(r) = y exp (z'kr) + | exp (— ikr ), where r = x 1 +y 2 -\- z 2 

4. ^ (r) = A exp (/ A:* r) + C exp (—i k • r) 


Section 5.1.4 

Problem 63 Derive expressions for operators $\hat{L}_x$ and $\hat{L}_y$ in spherical coordinates
presented in Eqs. 5.58 and 5.59.

Problem 64 Derive the orthogonalization condition for the associated Legendre
functions with $l_1 = l_2$ and $m_1 \neq m_2$ (Eq. 5.73). Do not attempt to obtain the
normalization coefficient.

Problem 65 Consider a function of polar and azimuthal angles $\theta$ and $\varphi$ defined as
$\psi(\theta, \varphi) = \sin\theta\,(1 - \cos\theta)\cos\varphi$.

1. Normalize this function.

2. Present this function as a linear combination of spherical harmonics $Y_l^m(\theta, \varphi)$.

3. If the observables presented by operators $\hat{L}^2$ and $\hat{L}_z$ are measured when a particle
is in the state presented by this wave function, what would be the possible
outcomes and their probabilities?

4. Find the expectation values and uncertainties of these observables in this state.

Problem 66 Repeat the previous problem for the following function:

3 9 

xj/ (0,(p) = - sin 20 exp (—icp) + 2 sin 6 sin 2 cp, 

but do not attempt to normalize it before rewriting it as a combination of spherical 
harmonics. 

Problem 67 Find the coordinate representation for lowering and raising ladder 
operators introduced in Sect. 3.3.4, and using found expressions, find Y\ ( 6 , cp) and 

Yr‘(o,<p). 

Problem 68 Find all zeroes of the angular probability distribution for a particle in 
angular states described by spherical harmonics Y 3 ( 6, (p ), Y\ ( 6, (p ), Y\ ( 6,cp ), and 

Problem 69 Find energy values for the system described by Hamiltonian: 


H = 


L x + L y 
2/i 



Section 5.2 


Problem 70 Using eigenvectors $|1\rangle$ and $|2\rangle$ of matrix

$$\begin{bmatrix} 0 & i \\ -i & 0 \end{bmatrix}$$

as a basis, construct the matrix representation of operators $|1\rangle\langle 1|$ and $|2\rangle\langle 2|$ and
verify the closure (or completeness) condition:

$$|1\rangle\langle 1| + |2\rangle\langle 2| = \hat{I},$$

where $\hat{I}$ is the unity operator.

Problem 71 Find the matrix of the operator: 

^ — 101) (01 I + |02) (02 | + |03) (03 | - 

i 101 ) (021 - 101 ) (03 | + *| 02 ) (011 - 103 ) (011 

in the basis formed by orthonormalized vectors $|\phi_1\rangle$, $|\phi_2\rangle$, and $|\phi_3\rangle$.

Problem 72 Write down expressions for spherical harmonics with orbital quantum
number $l = 1$. You can consider them as a basis in the subspace of eigenvectors of
operator $\hat{L}^2$ belonging to this eigenvalue in the sense that any linear combination of
them will also be an eigenvector of this operator.

1. Prove that this is indeed the case, i.e., that any linear combination of spherical
harmonics with $l = 1$ represents an eigenvector of $\hat{L}^2$ with the same eigenvalue.

2. Find the matrix representation of operator L x in this basis, and using the obtained 
matrix, find the representation of this operator’s eigenvectors in this basis. 

3. The found eigenvectors can also be considered as yet another basis. Find the 
representation of operators L x and L z in this basis. 

Problem 73 Present operator $i\,d/dx$ in the basis of spherical harmonics with $l = 1$.

Problem 74 Consider the matrix: 


$$A = \begin{bmatrix} 0 & 0 & -1 \\ 0 & 1 & 0 \\ -1 & 0 & 0 \end{bmatrix}.$$

Transform this matrix to the basis of its eigenvectors. Verify that the elements along 
the diagonal of the resulting matrix are eigenvalues of A. 

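Problem 75 Consider two matrices:

$$A = \begin{bmatrix} 1 & i & 1 \\ -i & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}$$

and

$$B = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 1 & i \\ 0 & -i & 0 \end{bmatrix}.$$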

Rewrite matrix B in the basis formed by eigenvectors of matrix A. 


Section 5.2.3 

Problem 76 Using the same approach, which was used in Sect. 5.2.3 for spin 1/2, 
find the matrix representation for the operators of the spin s = 3/2. Hint: What is 
the dimension of the space that contains the vectors representing states of this spin? 








Part II 
Quantum Models 


In this part of the book, we will play with some of the toys that physicists created
in order to get better insight into a variety of new and unusual properties exhibited
by real systems obeying the laws of quantum mechanics. These toys rarely represent
real systems, and this is why we call them models. Still, in many instances they
provide the necessary first experience and conceptual understanding required to deal
with reality in all its complexity. Using models allows us to focus on those properties
of the real world that appear to be of the most significance, at least for the class of
problems we are interested in. One can think of models in quantum mechanics as
impressionist or postimpressionist paintings, in which, instead of painstaking attention
to details, the main focus is on capturing “the essence” of the object, whatever
this might mean. Quantum mechanical models develop physical intuition about the
phenomena under study and can often be used as a first iteration of an approximation
scheme yielding a more accurate and quantitative description of nature.



Chapter 6 

One-Dimensional Models 






One-dimensional models might appear in quantum mechanics in two, in a way,
diametrically opposite situations. In one case, you can pretend that the potential
energy of a particle changes only in one direction, such as the potential energy
of a charged particle in a uniform electric field. Classically, this would mean a motion characterized
by acceleration in one direction and constant velocity in perpendicular directions.
By choosing an appropriate inertial coordinate system, you can always eliminate the
constant velocity component and consider this motion as rectilinear. Quantum
mechanically, this situation has to be described in the coordinate representation,
and the respective coordinate wave function can be presented in the form

$$\psi(\mathbf{r}) = e^{i(k_x x + k_y y)}\varphi(z), \tag{6.1}$$

where the Z-axis of the coordinate system is arbitrarily chosen to lie along the direction
in which the potential energy changes. The behavior of the wave function in
two perpendicular directions (X and Y) is that of a free particle with conserved
components of momentum $p_x = \hbar k_x$ and $p_y = \hbar k_y$. Substituting this
expression into Eq. 5.35, where the potential energy is taken to have the form of
$V(\mathbf{r}) = V(z)$, and canceling the exponential factors on both sides of the equation,
you will end up with the following one-dimensional equation:

$$-\frac{\hbar^2}{2m_e}\frac{d^2\varphi}{dz^2} + V(z)\varphi = E_z\varphi, \tag{6.2}$$

where

$$E_z = E - \frac{\hbar^2 k_x^2}{2m_e} - \frac{\hbar^2 k_y^2}{2m_e}$$


Fig. 6.1 A schematic of a 
semiconductor 
heterostructure, in which the 
motion of electrons in the 
direction perpendicular to the 
planes of the layers can be 
described by the 
one-dimensional model 



is the contribution of the motion in z direction to the total energy of the system E. 
Values of p x and p y are determined by the initial state of the system, which may 
or may not be one of the eigenvectors of the Hamiltonian with given values of p x 
and p y . In the latter case, the particular solution of the time-dependent Schrodinger 
equation satisfying the initial conditions will be given by the linear combination of 
the functions presented in Eqs. 6.1 and 6.2, but for now I will focus only on the 
stationary states, which correspond to initial conditions with definite p x and p y . 

While for a long time this type of one-dimensional model was used mostly 
in classrooms to illustrate basic quantum effects to unsuspecting students, the 
technological advances of the last 50 years made this model quite relevant as 
a stepping stone to understanding properties of practically important artificial 
structures made of planar layers of several different semiconductors arranged in an 
alternating order (see Fig. 6.1). It can be shown (way above your pay grade though) 
that the motion of electrons in such structures can be approximately described by a 
potential energy, which only changes in the direction perpendicular to the plane of 
the layers (growth direction). 

The second situation, in which the one-dimensional model can have at least some 
relation to reality, is the case of potentials confining the motion of the particle in all 
directions but one. One can imagine a particle moving inside of a cylindrical tube 
with impenetrable walls. The motion perpendicular to the axis of the cylinder is 
characterized by discrete allowed values of energy (I will show it later, for now 
you will have to trust me on that), and if the radius of the tube is small enough, the 
distance between adjacent energy levels can be sufficiently large that for all practical
purposes only one of these energy levels needs to be taken into account. In this case, the
transverse (as in perpendicular to the axis of the cylinder) motion is completely
“frozen,” and one is again left with pure one-dimensional motion. In all cases, we
are dealing with the Schrödinger equation in the form of Eq. 6.2, which is the main
object of study in this chapter.

6.1 Free Particle and the Wave Packets


Before taking on quantum states of electrons in one-dimensional piecewise potentials
such as wells or barriers, it is useful to consider the simplest quantum
mechanical model: a freely propagating particle, i.e., one not interacting with anything.
In classical physics, as we all know, such a particle would move with a constant
velocity, $v$, and can be characterized by conserved kinetic energy $K = mv^2/2$ and
momentum $p = mv$. In quantum mechanics, states of a free particle are the solutions
of the Schrödinger equation with zero potential

$$i\hbar\frac{\partial|\psi\rangle}{\partial t} = \frac{\hat{p}^2}{2m_e}|\psi\rangle . \tag{6.3}$$


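It is quite easy to see that the stationary states of a free particle are eigenvectors of
the momentum operator:

$$|\psi\rangle = \exp\left(-i\frac{E_p t}{\hbar}\right)|\mathbf{p}\rangle , \tag{6.4}$$

where $|\mathbf{p}\rangle$ is defined by $\hat{\mathbf{p}}|\mathbf{p}\rangle = \mathbf{p}|\mathbf{p}\rangle$. Substitution of Eq. 6.4 into Eq. 6.3 yields

$$E_p = \frac{p^2}{2m_e}, \tag{6.5}$$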


which is an expected classical relation between energy and momentum of a free
particle, often called the dispersion relation. Historically, the Schrödinger equation was
devised to make sure that the quantum theory respects this relation between energy
and momentum. Indeed, in the position representation, the Schrödinger equation
becomes

$$i\hbar\frac{\partial\Psi(\mathbf{r},t)}{\partial t} = -\frac{\hbar^2\nabla^2}{2m_e}\Psi(\mathbf{r},t) \tag{6.6}$$

with stationary state solutions of the form

$$\Psi(\mathbf{r},t) = \frac{1}{(2\pi\hbar)^{3/2}}\exp\left(-i\frac{E_p}{\hbar}t + i\frac{\mathbf{p}\cdot\mathbf{r}}{\hbar}\right), \tag{6.7}$$

where I used $\delta$-function normalized eigenvectors of the momentum operator given in
Eq. 5.21. Now, one can argue that the Schrödinger equation contains the first-order
time derivative because the dispersion relation, Eq. 6.5, is linear in $E$, while the
derivative over coordinates must be of the second order to reproduce the term $p^2$ in
Eq. 6.5. Further, one can argue that since the time derivative in the Schrödinger
equation is only of the first order, the corresponding wave function must be
represented by a complex exponential function rather than by a real trigonometric
function, which, in turn, makes the factor $i$ in front of the time derivative necessary
to compensate for the similar factor in the argument of the wave function.

The wave function of the form given in Eq. 6.7 was conceived in the early
days of quantum mechanics as a means to reconcile particle and wavelike properties
of quantum objects. However, it was clear from the very beginning that regardless of
the chosen interpretation (statistical due to Born or Schrödinger’s pilot wave), there
are several problems with assigning this function to represent quantum states of real
particles. First, its absolute value is uniform in space, which can hardly represent an
actual localized particle regardless of the chosen interpretation. Also, the motion of
the wave represented by Eq. 6.7 is characterized by phase velocity $v_{ph} = \omega/k =
E/p = p/(2m_e)$, which is half of the corresponding classical velocity $v_{cl} = p/m_e$,
making it difficult to associate it with the motion of a particle.

To get around this conundrum, it was suggested that actual states of the
particles (in either interpretation) are presented not by stationary states but by their
superposition, which will still solve the Schrödinger equation 6.6. It is quite easy to
show that by choosing an appropriate superposition, it is possible, for instance, to
localize a particle within an arbitrarily small region, solving at least one of the listed
problems. To see how this comes about, consider a wave function at time $t = 0$ and
form a superposition of the form

$$\psi(\mathbf{r}) = \frac{1}{(2\pi\hbar)^{3/2}}\int A(\mathbf{p})\exp\left(i\frac{\mathbf{p}\cdot\mathbf{r}}{\hbar}\right)d^3p . \tag{6.8}$$

In Sect. 5.1.1, it was shown that Eq. 6.8 can be inverted to yield

$$A(\mathbf{p}) = \frac{1}{(2\pi\hbar)^{3/2}}\int \psi(\mathbf{r})\exp\left(-i\frac{\mathbf{p}\cdot\mathbf{r}}{\hbar}\right)d^3r , \tag{6.9}$$


so that by choosing an appropriate $A(\mathbf{p})$, I can “generate” an initial ($t = 0$) wave
function with an arbitrary degree of localization. Now all I need is to consider
the time dependence of this initial superposition to see if other problems outlined
above can also be circumvented by the superposition states, which, in the case of
freely propagating particles, are often called wave packets.

Here I will focus on just one particular example of the wave packets, which
despite its relative simplicity will help me to illustrate most of the relevant ideas.
First of all, I will simplify the consideration by limiting it to the case of
one-dimensional motion described by a wave function, which depends on a single
coordinate, say, $z$. Integrals in Eqs. 6.8 and 6.9 are in this case reduced to
one-dimensional form

$$\psi(z) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty} A(p_z)\exp\left(i\frac{p_z z}{\hbar}\right)dp_z \tag{6.10}$$

$$A(p_z) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty} \psi(z)\exp\left(-i\frac{p_z z}{\hbar}\right)dz . \tag{6.11}$$


Next, I will assume that the initial state of the particle is described by function

$$\psi(z) = C\exp\left[-\frac{(z-\bar{z})^2}{(2\Delta z_0)^2}\right]\exp\left(i\frac{\bar{p}_z z}{\hbar}\right), \tag{6.12}$$

where constant $C$ is found from the normalization condition


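$$\int_{-\infty}^{\infty}|\psi(z)|^2dz = |C|^2\int_{-\infty}^{\infty}\exp\left[-\frac{(z-\bar{z})^2}{2(\Delta z_0)^2}\right]dz = |C|^2\sqrt{2}\,\Delta z_0\int_{-\infty}^{\infty}\exp\left(-x^2\right)dx = |C|^2\Delta z_0\sqrt{2\pi} = 1 \;\Rightarrow\; C = \frac{1}{\sqrt{\Delta z_0\sqrt{2\pi}}} .$$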


In the course of computing the normalization integral, I introduced a new
integration variable $x = (z - \bar{z})/(\sqrt{2}\,\Delta z_0)$ and used a well-known integral
$\int_{-\infty}^{\infty}\exp(-x^2)\,dx = \sqrt{\pi}$. Thus, my initial state is represented by the normalized
wave function, where the amplitude of the plane wave $\exp(i\bar{p}_z z/\hbar)$ is modulated by
the so-called Gaussian function

$$\psi(z) = \frac{1}{\sqrt{\Delta z_0\sqrt{2\pi}}}\exp\left[-\frac{(z-\bar{z})^2}{(2\Delta z_0)^2}\right]\exp\left(i\frac{\bar{p}_z z}{\hbar}\right). \tag{6.13}$$


The probability distribution corresponding to this wave function is peaked at $z = \bar{z}$ and falls off from its maximum value as $z$ moves away from $\bar{z}$. Parameter $\Delta z_0$ determines how fast the decrease of the probability takes place: the larger $\Delta z_0$, the larger deviation from $\bar{z}$ is required to decrease the probability density by a factor of $e$. Varying $\Delta z_0$, one can control the degree of particle localization: a smaller $\Delta z_0$ corresponds to better localized particles (see Fig. 6.2). Formally speaking, one can define $\bar{z}$ and $\Delta z_0$ as expectation value and uncertainty of the coordinate in the state described by this wave function. Indeed, I can easily compute

$$\langle z\rangle = \frac{1}{\Delta z_0\sqrt{2\pi}}\int_{-\infty}^{\infty} z\exp\left[-\frac{(z-\bar{z})^2}{2(\Delta z_0)^2}\right]dz = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty}\left(x\sqrt{2}\,\Delta z_0 + \bar{z}\right)\exp\left(-x^2\right)dx = \bar{z},$$

where I took into account that the normalization integral computed earlier is equal to unity and the fact that the integral of an odd function over a symmetric interval is zero. The uncertainty takes a bit more work:

$$\langle z^2\rangle = \frac{1}{\Delta z_0\sqrt{2\pi}}\int_{-\infty}^{\infty} z^2\exp\left[-\frac{(z-\bar{z})^2}{2(\Delta z_0)^2}\right]dz = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty}\left(x\sqrt{2}\,\Delta z_0 + \bar{z}\right)^2\exp\left(-x^2\right)dx = \bar{z}^2 + \frac{2(\Delta z_0)^2}{\sqrt{\pi}}\int_{-\infty}^{\infty}x^2\exp\left(-x^2\right)dx = \bar{z}^2 + (\Delta z_0)^2,$$

where I used another well-known integral $\int_{-\infty}^{\infty}x^2\exp(-x^2)\,dx = \sqrt{\pi}/2$. Subtracting $\bar{z}^2$ from $\langle z^2\rangle$, you can convince yourself that $\Delta z_0$ is, indeed, the uncertainty of the coordinate.

Fig. 6.2 Normalized Gaussian wave functions with different values of the width parameter $\Delta z_0$: with decreasing $\Delta z_0$ the function narrows, while its maximum grows such that the total area under the curve remains equal to unity

It should be noted that these arguments do not contradict Schrödinger’s pilot
wave interpretation, according to which the wave presented by the wave packet is a 
real material object accompanying a particle and whose width defines the degree of 
particle localization. 
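The normalization, expectation value, and uncertainty of the coordinate for the Gaussian state can also be checked by direct numerical integration. The sketch below assumes NumPy; the values of `z_bar`, `dz0`, and `p_bar` are arbitrary illustrative numbers, units with $\hbar = 1$ are assumed, and a simple Riemann sum stands in for the integrals.

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration.
hbar = 1.0
z_bar = 1.5   # centre of the packet
dz0 = 0.7     # width parameter Delta z_0
p_bar = 2.0   # mean momentum entering the plane-wave factor

# Grid wide enough that the Gaussian tails are negligible at the ends.
z = np.linspace(z_bar - 12 * dz0, z_bar + 12 * dz0, 20001)
step = z[1] - z[0]

# Normalized Gaussian-modulated plane wave.
C = 1.0 / np.sqrt(dz0 * np.sqrt(2 * np.pi))
psi = C * np.exp(-(z - z_bar) ** 2 / (2 * dz0) ** 2) * np.exp(1j * p_bar * z / hbar)

density = np.abs(psi) ** 2
norm = np.sum(density) * step
mean_z = np.sum(z * density) * step
mean_z2 = np.sum(z ** 2 * density) * step
uncertainty = np.sqrt(mean_z2 - mean_z ** 2)

print(norm, mean_z, uncertainty)
```

The three printed numbers should reproduce unity, `z_bar`, and `dz0`, respectively, to within the accuracy of the quadrature.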

Now I can find the appropriate amplitudes $A(p_z)$ in Eq. 6.10, which would
reproduce the wave function given by Eq. 6.13. Substitution of this equation into
Eq. 6.11 yields

This was a long calculation, but it is worth the effort to peruse it carefully.
Some of the tricks that I used in its course were the substitution of variable
$x = (z - \bar{z})/(2\Delta z_0)$, presenting an expression of the form $a^2 + 2ba$ as a complete
square, $a^2 + 2ba + b^2 - b^2 = (a+b)^2 - b^2$, and finally the fact that integral
$\int_{-\infty}^{\infty}dx\exp\left[-(x-x_0)^2\right]$ still equals $\sqrt{\pi}$ regardless of the value of $x_0$. Before
continuing, I will set $\bar{z} = 0$, which amounts to the choice of the zero of the
coordinate $z$, and introduce new parameter $\Delta p = \hbar/(2\Delta z_0)$. Then the expression
for $A(p_z)$ becomes


$$A(p_z) = \frac{1}{\sqrt{\Delta p}\,(2\pi)^{1/4}}\exp\left[-\frac{(p_z-\bar{p}_z)^2}{(2\Delta p)^2}\right], \tag{6.14}$$

and it is easy to verify (do it!) that, as expected, $\int|A(p_z)|^2dp_z = 1$, while $\langle p_z\rangle = \bar{p}_z$,
and parameter $\Delta p$ determines the uncertainty of the particle’s momentum. Recalling
the definition of this parameter in terms of the uncertainty of coordinates, you
can see that these two parameters obey the minimum version of the Schrödinger
uncertainty principle:

$$\Delta p\,\Delta z_0 = \frac{\hbar}{2}. \tag{6.15}$$

This is the special property of the Gaussian distribution: for all other initial states,
the product of the uncertainties would be larger than $\hbar/2$.
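For readers who would rather let a computer "do it", here is a minimal numerical sketch of this verification. It assumes NumPy, units with $\hbar = 1$, the choice $\bar{z} = 0$, and arbitrary illustrative values of `dz0` and `p_bar`; the momentum amplitude is obtained from the one-dimensional Fourier integral approximated by a Riemann sum.

```python
import numpy as np

hbar = 1.0
dz0 = 0.5                        # hypothetical coordinate uncertainty
p_bar = 3.0                      # hypothetical mean momentum
dp_expected = hbar / (2 * dz0)   # momentum uncertainty expected for a Gaussian

# Coordinate grid and the Gaussian wave function centred at z = 0.
z = np.linspace(-12 * dz0, 12 * dz0, 1201)
dz = z[1] - z[0]
psi = np.exp(-z**2 / (2 * dz0)**2 + 1j * p_bar * z / hbar)
psi = psi / np.sqrt(dz0 * np.sqrt(2 * np.pi))

# Momentum grid centred at the mean momentum.
p = np.linspace(p_bar - 12 * dp_expected, p_bar + 12 * dp_expected, 1201)
dp_step = p[1] - p[0]

# A(p) = (2 pi hbar)^(-1/2) * integral of psi(z) exp(-i p z / hbar) dz.
kernel = np.exp(-1j * np.outer(p, z) / hbar)
A = kernel @ psi * dz / np.sqrt(2 * np.pi * hbar)

prob_p = np.abs(A)**2
norm_p = np.sum(prob_p) * dp_step
mean_p = np.sum(p * prob_p) * dp_step
delta_p = np.sqrt(np.sum((p - mean_p)**2 * prob_p) * dp_step)
product = delta_p * dz0

print(norm_p, mean_p, delta_p, product)
```

The last printed number is the product of the two uncertainties, which for this state should come out equal to $\hbar/2$.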

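Having found $A(p_z)$, I can now find the time dependence of the initial wave
function by considering the superposition of the stationary states at an arbitrary
time $t$:

$$\Psi(z,t) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty}A(p_z)\exp\left(i\frac{p_z z}{\hbar} - i\frac{p_z^2 t}{2m_e\hbar}\right)dp_z ,$$

which at $t = 0$ is obviously reduced to the function given in Eq. 6.12. I begin
evaluating this integral, again, by introducing a dimensionless variable

$$x = \frac{p_z - \bar{p}_z}{2\Delta p}$$

and transforming this integral into

$$\Psi(z,t) = \frac{2\sqrt{\Delta p}}{\sqrt{2\pi\hbar}\,(2\pi)^{1/4}}\exp\left(i\frac{\bar{p}_z z}{\hbar} - i\frac{\bar{p}_z^2 t}{2m_e\hbar}\right)\int_{-\infty}^{\infty}\exp\left[-x^2\left(1 + i\frac{2(\Delta p)^2 t}{m_e\hbar}\right) + 2i\frac{\Delta p}{\hbar}x\left(z - \frac{\bar{p}_z}{m_e}t\right)\right]dx .$$

Before continuing, let me brush up this expression a bit, first, by replacing $\bar{p}_z/m_e$,
which corresponds to the classical velocity of the particle with momentum $\bar{p}_z$, with
$v_{gr}$, and second by introducing the notation

$$a^2 = 1 + i\frac{2(\Delta p)^2 t}{m_e\hbar} = 1 + i\frac{\hbar t}{2m_e(\Delta z_0)^2}, \tag{6.16}$$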


where in the second expression I replaced $\Delta p$ with $\Delta z_0$ using Eq. 6.15. As a result,
the expression for the wave function now takes a somewhat less cumbersome form:

$$\Psi(z,t) = \frac{2\sqrt{\Delta p}}{\sqrt{2\pi\hbar}\,(2\pi)^{1/4}}\exp\left(-it\frac{\bar{p}_z^2}{2m_e\hbar} + i\frac{\bar{p}_z z}{\hbar}\right)\int_{-\infty}^{\infty}\exp\left[-a^2x^2 + 2i\frac{\Delta p}{\hbar}x\left(z - v_{gr}t\right)\right]dx .$$

Performing the integral over $x$ (using all the same tricks as before, substitution $\tilde{x} = ax$
and completion of the square), I will obtain the final expression for the wave
function $\Psi(z,t)$:

$$\Psi(z,t) = \frac{1}{a\sqrt{\Delta z_0}\,(2\pi)^{1/4}}\exp\left[-\frac{(z - v_{gr}t)^2}{(2a\Delta z_0)^2}\right]\exp\left(i\frac{\bar{p}_z z}{\hbar} - i\frac{\bar{p}_z^2 t}{2m_e\hbar}\right), \tag{6.17}$$

where at the last step I used Eq. 6.15 to replace $\Delta p$ with $\Delta z_0$. Not surprisingly, at
$t = 0$ Eq. 6.17 is reduced to $\psi(z)$ as given by Eq. 6.13.

It is quite educational to inspect various factors in this expression separately. The
last factor is a regular plane wave with a wave number determined by the expectation
value of the momentum $\bar{p}_z$ and corresponding frequency $\bar{\omega} = \bar{p}_z^2/(2m_e\hbar)$. This wave
propagates with standard phase velocity $v_{ph} = \hbar\bar{\omega}/\bar{p}_z$, but its amplitude, defined
by the Gaussian exponential factor, is also time and coordinate dependent. For any
given instant $t$, there is a coordinate $z_{max} = v_{gr}t$ at which the amplitude is largest;
it decreases as $z$ deviates from it in either direction. One can say that the
amplitude factor modulates the initial plane wave, turning it into a wave packet more
or less localized within a finite coordinate region. This localization region obviously
changes its position with time, as is evident from the definition of $z_{max}$, and this
motion of the localization region occurs with velocity $v_{gr} = z_{max}/t$. This velocity is
called the group velocity of the wave packet because it characterizes the motion of the
entire group of waves participating in the superposition forming the packet, while
the phase velocity describes the motion of each separate wave component of this
superposition. These two velocities are different because the phase velocity $v_{ph} =
p/2m_e$ depends on the momentum $p$ and is, therefore, different for each member of
the group. To illustrate all these points, I plotted the real part of the wave function
presented in Eq. 6.17 for two distinct time instants as shown in Fig. 6.3.

Fig. 6.3 The real part of the wave function representing a wave packet as a function of the
coordinate for two different instants. You can see suppression of oscillations away from the main
maximum as well as the displacement of the main maximum. The time interval is chosen to be
equal to the period of the oscillating factor so that the magnitude of the main maxima remains
the same for both instants. For other time intervals, it does not have to be the case because the
decrease of the cos function can damp the maximum’s magnitude. Also, this plot does not account
for the fact that the parameter $a$ in Eq. 6.17 is complex-valued and depends on time. See discussion
of the role of this parameter further in the text

It is easy
to see that the expression for the group velocity $v_{gr} = \bar{p}_z/m_e$ can be obtained from
the dispersion relation $E(p) = p^2/2m_e$ of the free particle as


Vn r — 


dE 

dp z 


Pz=P 


(6.18) 


Equation 6.18 can also be generalized to the case of three-dimensional propagation,
in which case the derivative is replaced by a gradient,
$\mathbf{v}_{gr} = \nabla_{\mathbf{p}}E(\mathbf{p})|_{\mathbf{p} = \bar{\mathbf{p}}}$, and also
to more exotic cases of particles whose dispersion relation is different from Eq. 6.5.
If you are wondering where on earth you can find free particles with a dispersion
different from the standard quadratic form, here are two examples for you: (1) the relation
between energy and momentum of relativistic particles is $E = \sqrt{m_e^2c^4 + p^2c^2}$,
and (2) electrons in semiconductors can in many practically important cases be
approximated as “free” particles with a modified dispersion relation $E(p)$. In general,
the picture of a wave packet propagating with the group velocity can be generalized
to non-Gaussian wave packets as long as they can be described by a momentum
wave function $A(p_z)$ with a single and relatively narrow maximum.
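
The recipe of Eq. 6.18 can be tried out numerically. The following is only a minimal sketch: it estimates $dE/dp$ by a central finite difference for the quadratic and the relativistic dispersion relations mentioned above. The function names and the chosen mean momentum (half of $m_ec$) are assumptions made for illustration and do not come from the text.

```python
import math

M_E = 9.1093837e-31   # electron mass, kg
C = 2.99792458e8      # speed of light, m/s


def energy_nonrel(p):
    """Free-particle dispersion E = p^2 / (2 m_e)."""
    return p * p / (2.0 * M_E)


def energy_rel(p):
    """Relativistic dispersion E = sqrt(m_e^2 c^4 + p^2 c^2)."""
    return math.sqrt((M_E * C * C) ** 2 + (p * C) ** 2)


def group_velocity(energy, p_bar, rel_step=1e-6):
    """Central-difference estimate of dE/dp evaluated at p = p_bar (Eq. 6.18)."""
    h = rel_step * abs(p_bar)
    return (energy(p_bar + h) - energy(p_bar - h)) / (2.0 * h)


# Illustrative (hypothetical) mean momentum: half of m_e * c.
p_bar = 0.5 * M_E * C

v_nonrel = group_velocity(energy_nonrel, p_bar)
v_rel = group_velocity(energy_rel, p_bar)

print("non-relativistic v_gr / c:", v_nonrel / C)
print("relativistic     v_gr / c:", v_rel / C)
```

For the quadratic dispersion the estimate reproduces $\bar{p}/m_e$, while for the relativistic one it reproduces $\bar{p}c^2/E$, which stays below $c$ for any momentum.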

So far, the behavior of the wave packet appears to be consistent not only
with the traditional Copenhagen interpretation of quantum mechanics but also with
Schrödinger’s pilot wave picture. However, when discussing the role of the amplitude
modulating factor in Eq. 6.17, I so far ignored an obvious “elephant in the
room,” which makes this discussion somewhat more nuanced. The parameter $\alpha$,
which sits “quietly” in the denominator of the modulating factor (as well as in a
normalization pre-factor), is complex-valued and time dependent. The first of these
circumstances makes the expression

$$B(z,t) = \frac{1}{\alpha\sqrt{\Delta z_0}\,(2\pi)^{1/4}}\exp\left[-\frac{(z - v_{gr}t)^2}{4\alpha^2(\Delta z_0)^2}\right] \tag{6.19}$$


not quite a pure amplitude because it also has a phase attached to it. This phase,
however, is of little interest, and in order to focus on the actual amplitude part of this
expression, I will consider its squared absolute value, $|B(z,t)|^2$, which, of course,
coincides with $|\psi(z,t)|^2$ and in the Copenhagen interpretation yields the probability
distribution $P(z,t)$ for the coordinates of the particle at any given time in the state
described by the wave packet $\psi(z,t)$:

(6.21)

(6.22)


Substitution of Eqs. 6.21 and 6.22 into Eq. 6.20 converts the expression for the
probability density into the following nice-looking form:

$$P(z,t) = \frac{1}{\sqrt{2\pi}\,\Delta z}\exp\left[-\frac{(z - v_{gr}t)^2}{2(\Delta z)^2}\right]. \tag{6.23}$$

Comparing this result with Eq. 6.13, it becomes clear that $z_{max} = v_{gr}t$ is the
expectation value of the coordinate in the state described by the wave packet, while
$\Delta z$ represents its uncertainty. Rewriting Eq. 6.21, I can present this uncertainty in a
more illuminating form:

$$\Delta z = \Delta z_0\sqrt{1 + \frac{\hbar^2t^2}{4m_e^2(\Delta z_0)^4}}, \tag{6.24}$$


which shows that the localization range of the wave packet increases with time
at a rate (roughly defined as the derivative $d(\Delta z)^2/d(t^2)$) inversely proportional
to the square of the initial uncertainty $\Delta z_0$. In other words, the tighter you try to squeeze
your particle into a smaller volume, the faster the localization volume of the particle
increases with time. This phenomenon of wave packet spreading is what kills
Schrödinger’s pilot wave interpretation: the broadening of the wave packets would
make such a pilot wave unstable. To get an intuitive feeling for how fast this
spreading takes place, assume that an electron is initially localized in a region of
atomic dimensions with $\Delta z_0 = 10^{-10}$ m. Substituting the values of the reduced Planck
constant and the electron’s mass into Eq. 6.24, you will get for $\Delta z$ (with $t$ in seconds):

$$\Delta z = 10^{-10}\sqrt{1 + 3.35\times10^{31}\,t^2}\ \text{m},$$

which reaches the value of $10^3$ m in less than 2 ms!
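
The estimate above is easy to reproduce. The sketch below evaluates Eq. 6.24 with $\hbar$ taken as the reduced Planck constant, in which case the coefficient of $t^2$ under the root is about $3.35\times10^{31}\ \text{s}^{-2}$ (using $h$ in place of $\hbar$ would give about $1.32\times10^{33}\ \text{s}^{-2}$). The helper names `packet_width` and `time_to_reach` are hypothetical and introduced only for this illustration.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837e-31      # electron mass, kg


def packet_width(t, dz0, mass=M_E):
    """Eq. 6.24: dz(t) = dz0 * sqrt(1 + hbar^2 t^2 / (4 m^2 dz0^4))."""
    return dz0 * math.sqrt(1.0 + (HBAR * t) ** 2 / (4.0 * mass ** 2 * dz0 ** 4))


def time_to_reach(width, dz0, mass=M_E):
    """Eq. 6.24 solved for t, given a target width >= dz0."""
    return (2.0 * mass * dz0 ** 2 / HBAR) * math.sqrt((width / dz0) ** 2 - 1.0)


dz0 = 1e-10  # initial localization of atomic size, m

# Coefficient multiplying t^2 under the square root, in 1/s^2.
coefficient = HBAR ** 2 / (4.0 * M_E ** 2 * dz0 ** 4)

# Time needed for the packet to spread to one kilometer.
t_km = time_to_reach(1e3, dz0)

print("coefficient of t^2:", coefficient)
print("time to reach 1e3 m (s):", t_km)
```

Because the root is dominated by the $t^2$ term almost immediately, the width grows practically linearly, at roughly $\hbar/(2m_e\Delta z_0)$ meters per second.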

This essentially completes the discussion of the free particle wave packets, but 
I cannot pass an opportunity to play with the Heisenberg picture whenever it is 
possible, and this is one of the simplest situations to showcase it. You can consider 
it as a reward to you for being such a good sport and wading with me through the 
tedious analysis of the Gaussian wave packet. 

Recalling that the Hamiltonian of a free particle is just

$$\hat{H} = \frac{\hat{\mathbf{p}}^2}{2m_e},$$

you can easily derive the Heisenberg equations for the components of the position
and momentum operators

$$\frac{d\hat{\mathbf{r}}}{dt} = \frac{\hat{\mathbf{p}}}{m_e}, \qquad \frac{d\hat{\mathbf{p}}}{dt} = 0$$

with the obvious solution

$$\hat{\mathbf{p}}(t) = \hat{\mathbf{p}}_0, \qquad \hat{\mathbf{r}}(t) = \hat{\mathbf{r}}_0 + \frac{\hat{\mathbf{p}}_0}{m_e}t, \tag{6.25}$$

where $\hat{\mathbf{r}}_0$ and $\hat{\mathbf{p}}_0$ are, as usual, the Schrödinger picture operators setting the initial
conditions in the Heisenberg picture. Assuming that the particle is in some arbitrary state
$|\chi\rangle$, which does not change with time in the Heisenberg picture, I can immediately
derive for the expectation value of the position operator:

$$\langle\hat{\mathbf{r}}\rangle = \langle\hat{\mathbf{r}}_0\rangle + \frac{\langle\hat{\mathbf{p}}_0\rangle}{m_e}t,$$

which is, of course, a three-dimensional version of the expression for the expectation
value of the coordinate found from Eq. 6.23 with $z_0$ set to zero. Now, squaring
Eq. 6.25 I get

$$\hat{\mathbf{r}}^2 = \hat{\mathbf{r}}_0^2 + \frac{\hat{\mathbf{p}}_0^2}{m_e^2}t^2 + \frac{t}{m_e}\left(\hat{\mathbf{r}}_0\cdot\hat{\mathbf{p}}_0 + \hat{\mathbf{p}}_0\cdot\hat{\mathbf{r}}_0\right).$$

Using the position representation for the operators $\hat{\mathbf{r}}_0$, $\hat{\mathbf{p}}_0$ and Eq. 6.13 to represent the state
$|\chi\rangle$, you can demonstrate by direct computation that

$$\langle\chi|\hat{\mathbf{r}}_0\cdot\hat{\mathbf{p}}_0 + \hat{\mathbf{p}}_0\cdot\hat{\mathbf{r}}_0|\chi\rangle = 0$$

so that one has for the uncertainty of the position $\Delta r^2$:

$$\Delta r^2 = \Delta r_0^2 + \frac{\Delta p^2}{m_e^2}t^2, \tag{6.26}$$

where $\Delta p^2$ is again the uncertainty of the momentum computed with the initial state
of the particle. If this initial state is Gaussian, then, limiting Eq. 6.26 to just a single
coordinate, you can use Eq. 6.15 to replace the momentum uncertainty with $\Delta z_0$,
which will yield Eq. 6.24 for the spreading of the wave packet. In addition to this,
however, Eq. 6.26 demonstrates that the phenomenon of wave packet spreading is
not limited only to one-dimensional Gaussian packets and is a general feature of
free propagation of quantum particles.
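
The one-dimensional Gaussian case can also be checked by brute force. The sketch below is written under explicit assumptions: natural units $\hbar = m_e = 1$, a packet of the Gaussian form with the complex parameter $\alpha^2 = 1 + i\hbar t/(2m_e\Delta z_0^2)$, and arbitrary illustrative values for $\Delta z_0$, $\bar{p}_z$ and $t$. It integrates $|\psi|^2$ on a grid and compares the resulting width with the prediction of Eq. 6.26 restricted to one coordinate. All function names are hypothetical.

```python
import cmath
import math

HBAR = 1.0    # natural units (an assumption of this sketch)
MASS = 1.0
DZ0 = 1.0     # initial position uncertainty
P_BAR = 2.0   # mean momentum


def density(z, t):
    """|psi(z, t)|^2 of the Gaussian packet; the plane-wave phase drops out."""
    alpha_sq = 1.0 + 1j * HBAR * t / (2.0 * MASS * DZ0 ** 2)
    v_gr = P_BAR / MASS
    amplitude = cmath.exp(-(z - v_gr * t) ** 2 / (4.0 * alpha_sq * DZ0 ** 2))
    amplitude /= cmath.sqrt(alpha_sq) * math.sqrt(DZ0) * (2.0 * math.pi) ** 0.25
    return abs(amplitude) ** 2


def moments(t, half_range=60.0, n=12001):
    """Norm, mean and standard deviation of z, using a simple rectangle rule."""
    step = 2.0 * half_range / (n - 1)
    center = P_BAR / MASS * t
    zs = [center - half_range + i * step for i in range(n)]
    ws = [density(z, t) * step for z in zs]
    norm = sum(ws)
    mean = sum(z * w for z, w in zip(zs, ws)) / norm
    var = sum((z - mean) ** 2 * w for z, w in zip(zs, ws)) / norm
    return norm, mean, math.sqrt(var)


t = 3.0
norm, mean, width = moments(t)

dp = HBAR / (2.0 * DZ0)                                  # Eq. 6.15
predicted = math.sqrt(DZ0 ** 2 + (dp * t / MASS) ** 2)   # Eq. 6.26, one coordinate

print("norm:", norm)
print("mean position:", mean, "expected:", P_BAR / MASS * t)
print("width:", width, "predicted:", predicted)
```

The norm stays equal to one, the mean moves with the group velocity, and the width follows the square-root law, which is the same statement as Eq. 6.24.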


6.2 Rectangular Potential Wells and Barriers 

6.2.1 Potential Wells: Systems with Mixed Spectrum 

The first important model, which I am going to introduce in this section, is
characterized by the potential profile shown in Fig. 6.4, which can be described as

$$V(z) = \begin{cases} V_w, & |z| < d/2 \\ V_b, & |z| > d/2 \end{cases} \tag{6.27}$$

where I assumed for concreteness that $V_b > V_w$. Such a potential profile is called a
potential well. If one chooses to count the energy from the bottom of the well, then
$V_w \to 0$ and $V_b \to V_b - V_w$. The energy levels in this potential must be separated
into two different regions, $V_w < E_z < V_b$ and $E_z > V_b$, with distinctly different types of
behavior (states with $E_z < V_w$ do not exist). In the former case, a classical particle
would have been confined between the two “walls” of this potential well at $|z| = d/2$,
bouncing back and forth, while in the latter, classical motion is unbounded (the
particle can be anywhere along the $Z$-axis). As was already discussed in Sect. 5.1.3,
two different types of classical behavior translate into different quantum behaviors
as well.

Fig. 6.4 Rectangular potential well

Bound States: Discrete Spectrum 

I will begin with the spectral region $V_w < E_z < V_b$, which corresponds to bound
classical motion, and where you should expect to see a discrete spectrum of energy
eigenvalues. The potential we are dealing with is a piecewise continuous function
with finite jumps at $z = \pm d/2$. For the range of energies under consideration, the
spatial regions defined by $|z| > d/2$ are classically forbidden. Therefore, as was
discussed in Sect. 5.1.3, Eq. 5.37 must be complemented by the boundary conditions
requiring that the wave function vanishes at $z \to \pm\infty$ and by continuity conditions
at $|z| = d/2$.

However, before I start dirtying my hands and digging into the boring business
of actually writing down the wave functions and matching the boundary conditions
and all that, I want to play with the problem a little bit more and see if I can
make this task a bit less boring. The nice thing about this particular potential is
that it is symmetric with respect to inversion of the coordinate $z$: $V(-z) = V(z)$,
and I hope that you recognize here your old acquaintance from Sect. 5.1.2, the
parity transformation, $\hat{\Pi}V(z) = V(-z)$. And not only is the potential symmetric,
but the boundary conditions are also symmetric: $\varphi(z) \to 0$ when $|z| \to \infty$.
Since the kinetic energy was shown earlier to be always parity invariant (it does not
change upon the parity transformation), you can confidently conclude that the entire
Hamiltonian of this system is symmetric with respect to this transformation. And
as I have already explained in Sect. 5.1.2, it means that the Hamiltonian commutes
with the parity operator, $\hat{\Pi}$, so that the wave functions representing eigenvectors
of this Hamiltonian also represent eigenvectors of $\hat{\Pi}$. Therefore, solutions of the
Schrödinger equation 5.37 with the potential given by Eq. 6.27 can be classified into
even ($\varphi(-z) = \varphi(z)$) and odd ($\varphi(-z) = -\varphi(z)$) functions, with the immediate
consequence that you only need to deal with boundary and continuity conditions
at $z > 0$. Indeed, the definite parity of the solutions, even or odd, ensures that the
conditions for $z < 0$ are satisfied simultaneously with those at $z > 0$. Here is the
power of symmetry for you: I just cut the number of equations to be solved to
satisfy the continuity conditions by half without even breaking a sweat! Using the
symmetry arguments for such a simple problem might seem a bit of an overkill: it
is not too difficult to solve it by simply using brute force. I still wanted to show
it to you so that you would be better prepared to understand the implications of
symmetry in more “sanity-threatening situations.”

Because of the discontinuity of the potential at $|z| = d/2$, solutions for the intervals
$|z| < d/2$ and $|z| > d/2$ must be found independently and stitched afterward using
continuity conditions.

1. $|z| < d/2$. The Schrödinger equation for this interval takes the form

$$\frac{d^2\varphi}{dz^2} = -\frac{2m_e}{\hbar^2}\left(E_z - V_w\right)\varphi(z),$$

where $E_z - V_w > 0$. This equation is similar to that of the free particle with
positive energy $E_z - V_w$, and its most general solution has, therefore, the form

$$\varphi(z) = Ae^{ikz} + Be^{-ikz} = \tilde{A}\sin kz + \tilde{B}\cos kz,$$

where

$$k = \sqrt{2m_e\left(E_z - V_w\right)}/\hbar \tag{6.28}$$

is a real quantity. The choice of the exponential or trigonometric functions to represent
this solution is a matter of one’s taste and/or convenience: the expressions
are equivalent with $\tilde{A} = i(A - B)$ and $\tilde{B} = A + B$. However, since we know that
$\varphi(z)$ must have a definite parity, the trigonometric form is more convenient to
take advantage of this insight. Indeed, in order to generate an even solution I can
simply set $\tilde{A} = 0$, while an odd solution is obtained by choosing $\tilde{B} = 0$:

$$\varphi_e(z) = B\cos kz, \quad |z| < d/2 \tag{6.29}$$

$$\varphi_o(z) = A\sin kz, \quad |z| < d/2. \tag{6.30}$$

Notation for the remaining coefficients is irrelevant, so I dropped the tildes above
the letters.
2. $|z| > d/2$. In this case, the right-hand side of the corresponding Schrödinger
equation

$$\frac{d^2\varphi}{dz^2} = \frac{2m_e}{\hbar^2}\left(V_b - E_z\right)\varphi(z)$$

has a positive coefficient for the range of energies under consideration ($V_b - E_z > 0$), so that the
general solution of this equation is given by

$$\varphi(z) = Ce^{\kappa z} + De^{-\kappa z}, \tag{6.31}$$

where now

$$\kappa = \sqrt{2m_e\left(V_b - E_z\right)}/\hbar \tag{6.32}$$

is a real quantity. As has already been mentioned, the solution of the Schrödinger
equation in the region $z > d/2$ must vanish at $z \to +\infty$ (Eq. 5.36). The function
presented by Eq. 6.31 satisfies this requirement only if the exponentially growing
term is eliminated, which I achieve by simply requiring that $C = 0$. Thus, the
wave function for $z > d/2$ becomes

$$\varphi(z) = De^{-\kappa z}, \quad z > d/2 \tag{6.33}$$

for both even and odd solutions. For negative values of the coordinate, $z < -d/2$,
Eq. 6.33 would produce for even and odd solutions correspondingly:

$$\varphi_e(z) = De^{\kappa z} \tag{6.34}$$

$$\varphi_o(z) = -De^{\kappa z}. \tag{6.35}$$
Before continuing with stitching the wave functions at $z = d/2$, let me point out
that the solutions in the classically allowed region $|z| < d/2$ are presented by
oscillating functions. Upon crossing to the classically forbidden region $|z| > d/2$,
the oscillating character of the solutions turns into a monotonic decrease. This
example illustrates generic properties discussed in Sect. 5.1.3 and can be used to
formulate a general rule of thumb applicable to any piecewise constant potential: in
the classically allowed regions, the wave function is represented by a combination of
trigonometric functions, while in classically forbidden regions, the solution is given
by a combination of exponential functions with a real argument. However, I need to
warn you to pay attention to the fact that I was able to eliminate the exponentially
growing terms in Eqs. 6.33-6.35 only because the classically forbidden region
extended all the way to positive or negative infinity. If, as might happen in
certain problems, the potential had another jump and the classically forbidden
region crossed over to a classically allowed region, you would have to
keep both growing and decreasing exponential functions because the conditions at
infinity can only be used within a region of coordinates extending, well, to infinity.

Equation 6.33 must be stitched with either Eq. 6.29 or 6.30 to generate a continuous
solution describing the wave function in the entire domain of the coordinate $z$. This
must be done separately for even and odd solutions. In the former case, the continuity
of the wave function and of its derivative at $z = d/2$ requires that

$$B\cos\frac{kd}{2} = De^{-\kappa d/2} \tag{6.36}$$

$$-Bk\sin\frac{kd}{2} = -\kappa De^{-\kappa d/2}. \tag{6.37}$$

For arbitrary values of $E_z$, which appears in these equations via parameters $k$ and $\kappa$,
Eqs. 6.36 and 6.37 can have only the trivial solution $B = D = 0$. Obviously, this is not
what we want. However, if I insist on having non-zero solutions, I must impose a
special condition on the allowed values of $k$ and $\kappa$. One way to derive this condition
is to divide Eq. 6.37 by Eq. 6.36 (this is allowed because we require that $B, D \neq 0$),
yielding

$$k\tan\frac{kd}{2} = \kappa. \tag{6.38}$$

Taking into account Eqs. 6.28 and 6.32, you can recognize that Eq. 6.38 is a transcendental
equation for the energy $E_z$. Solutions of this equation determine the values of
energy permitting the existence of non-zero coefficients $B$ and $D$ and, hence, of
the wave functions satisfying all the boundary conditions. The solutions of Eq. 6.38
are obviously eigenvalues of the Hamiltonian, also called allowed energy values or
energy levels.

Equation 6.38 does not submit to an analytical solution, but it still can be
qualitatively analyzed to help you determine, at least, the number of solutions
it might have. To this end, it is convenient to rewrite this equation introducing
dimensionless variables, such as $kd/2$ and $\kappa d/2$. To facilitate the transition to these
variables, I first compute $k^2 + \kappa^2$ using Eqs. 6.28 and 6.32:

$$k^2 + \kappa^2 = \frac{2m_e\left(V_b - V_w\right)}{\hbar^2}.$$

Multiplying this expression by $d^2/4$ and introducing the dimensionless variable $\varepsilon$ for $kd/2$, I
have

$$\varepsilon^2 + \frac{\kappa^2d^2}{4} = \frac{m_e\left(V_b - V_w\right)d^2}{2\hbar^2} \equiv \varepsilon_0^2,$$

where I introduced another dimensionless parameter $\varepsilon_0$ defined as

$$\varepsilon_0 = \frac{d}{\hbar}\sqrt{\frac{m_e\left(V_b - V_w\right)}{2}}.$$

Multiplying both sides of Eq. 6.38 by $d/2$, I can rewrite it now as

$$\tan\varepsilon = \frac{\sqrt{\varepsilon_0^2 - \varepsilon^2}}{\varepsilon}. \tag{6.39}$$

You can see that $\varepsilon_0$ incorporates all relevant parameters of the system, solely
determining the allowed energy values. This is an excellent illustration of the power
of dimensionless variables: four different parameters have collapsed into a single
one, which rules them all. Without even solving the equation, I know now that all
rectangular potential wells with different values of $m_e$, $V_b - V_w$ and $d$ will have the
same dimensionless energy levels as long as all these parameters correspond to the
same $\varepsilon_0$.

Obviously, Eq. 6.39 only makes sense for $\varepsilon < \varepsilon_0$, so it is important to understand
the physical meaning of this condition. Substituting all necessary definitions, you
can see that $\varepsilon = \varepsilon_0$ turns into

$$\frac{m_e\left(E_z - V_w\right)d^2}{2\hbar^2} = \frac{m_e\left(V_b - V_w\right)d^2}{2\hbar^2} \;\Rightarrow\; E_z = V_b,$$

i.e., at the point $\varepsilon = \varepsilon_0$ the energy crosses over the potential barrier, where
the assumptions used to derive Eq. 6.39 lose their validity.

In order to understand solutions to Eq. 6.39, it is useful to visualize graphs of its
left-hand and right-hand sides. As $\varepsilon$ increases from zero, the function on the right
decreases from positive infinity to zero at $\varepsilon = \varepsilon_0$, where it terminates. The left-hand
side is a tangent, which grows from zero at $\varepsilon = 0$ and diverges
at $\varepsilon = \pi/2$, where it jumps all the way to negative infinity, starts its
climb toward the next zero at $\pi$, goes to infinity at $3\pi/2$, and so on. If $\varepsilon_0 < \pi$,
the two functions will cross only once because the right-hand side will end before
the left-hand side manages to get to the positive territory again. Once $\varepsilon_0$, however,
crosses the $\pi$ threshold, the second crossing becomes possible, and one more when
$\varepsilon_0$ exceeds $2\pi$, and so on. An important point here is that at least one even solution
always exists no matter how small $\varepsilon_0$ becomes. Another important qualitative point
one may take home is that the magnitude of $\varepsilon_0$ depends on two main parameters:
the depth of the well $V_b - V_w$ and its geometric width $d$; wider and deeper
wells are able to accommodate a larger number of allowed energy levels, and
in order to decrease the number of energy eigenvalues belonging to the discrete
spectrum, one can make the well either narrower or shallower. It is important to
remember though that the geometric width of the well affects not only the values of
allowed energies but also the difference between adjacent energy levels, which are
closer to each other in wider wells.
The stitching conditions for the odd wave functions take the form

$$A\sin\frac{kd}{2} = De^{-\kappa d/2} \tag{6.40}$$

$$Ak\cos\frac{kd}{2} = -\kappa De^{-\kappa d/2}, \tag{6.41}$$

resulting in a different equation for allowed energy values

$$\cot\varepsilon = -\frac{\sqrt{\varepsilon_0^2 - \varepsilon^2}}{\varepsilon}. \tag{6.42}$$

The right-hand side of this equation changes from negative infinity to zero at $\varepsilon = \varepsilon_0$,
while the left-hand side begins at positive infinity at $\varepsilon = 0$ and crosses to the
negative territory only for $\varepsilon > \pi/2$. Thus, if $\varepsilon_0 < \pi/2$, this equation has no
solutions. In this case, the only allowed eigenvalue of energy corresponds to a single
even solution for the wave function. Increasing $\varepsilon_0$ beyond $\pi/2$ will produce the first
odd solution, and it will happen before the second even solution appears. Following
this line of reasoning, you can see that with increasing $\varepsilon_0$, eigenvalues corresponding
to odd and even wave functions appear in an alternating manner.

The state corresponding to the lowest energy is called the ground state, and as you 
just saw, it is always represented by an even function. The next energy corresponds 
to an odd solution, then you have again an energy level corresponding to the even 
solution, then to an odd one again, and this pattern repeats until the last allowed 
eigenvalue is reached, which can be either odd or even. 
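
The graphical argument above translates directly into a small root finder. The sketch below rests on two observations that follow from the discussion: the $n$-th solution, if it exists, lies between $(n-1)\pi/2$ and $n\pi/2$, and it exists only when $\varepsilon_0 > (n-1)\pi/2$. To avoid the poles of the tangent and cotangent, Eq. 6.39 is multiplied through by $\varepsilon\cos\varepsilon$ and Eq. 6.42 by $\varepsilon\sin\varepsilon$ before bisection. The function name `bound_state_roots` and the value $\varepsilon_0 = 4$ are assumptions made for illustration only.

```python
import math


def bound_state_roots(eps0, tol=1e-12):
    """Return [(n, parity, eps_n), ...] for a well characterized by eps0."""

    def g_even(eps):
        # Eq. 6.39 multiplied by eps * cos(eps): no poles, same roots.
        s = math.sqrt(max(eps0 ** 2 - eps ** 2, 0.0))
        return eps * math.sin(eps) - s * math.cos(eps)

    def g_odd(eps):
        # Eq. 6.42 multiplied by eps * sin(eps): no poles, same roots for eps > 0.
        s = math.sqrt(max(eps0 ** 2 - eps ** 2, 0.0))
        return eps * math.cos(eps) + s * math.sin(eps)

    roots = []
    n = 1
    # The n-th level exists only if eps0 exceeds (n - 1) * pi / 2.
    while (n - 1) * math.pi / 2 < eps0:
        g = g_even if n % 2 == 1 else g_odd
        lo = (n - 1) * math.pi / 2
        hi = min(n * math.pi / 2, eps0)
        g_lo = g(lo)
        # Plain bisection: g changes sign exactly once on (lo, hi).
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            g_mid = g(mid)
            if (g_mid > 0) == (g_lo > 0):
                lo, g_lo = mid, g_mid
            else:
                hi = mid
        roots.append((n, "even" if n % 2 == 1 else "odd", 0.5 * (lo + hi)))
        n += 1
    return roots


eps0 = 4.0  # illustrative (hypothetical) well strength
for n, parity, eps in bound_state_roots(eps0):
    # (E_zn - V_w) / (V_b - V_w) equals (eps_n / eps0)^2.
    print(n, parity, eps, (eps / eps0) ** 2)
```

The last printed column is the level position measured as a fraction of the well depth, $(E_{zn} - V_w)/(V_b - V_w) = \varepsilon_n^2/\varepsilon_0^2$, which follows from the definitions of $\varepsilon$ and $\varepsilon_0$. The number of levels found this way equals the smallest integer not less than $2\varepsilon_0/\pi$.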

All solutions of Eqs. 6.39 and 6.42 can be enumerated as $\varepsilon_n$, where $n = 1$
corresponds to the ground state (even) solution, $n = 2$ to the lowest in energy odd
solution, and so on and so forth. A similar enumeration can be applied to the wave
functions

$$\varphi_n(z) = \begin{cases} B_n\cos k_nz, & |z| < d/2 \\ B_n\cos\left(k_nd/2\right)e^{\kappa_nd/2}e^{-\kappa_n|z|}, & |z| > d/2 \end{cases} \qquad n = 1, 3, 5, \ldots \tag{6.43}$$

and

$$\varphi_n(z) = \begin{cases} A_n\sin k_nz, & |z| < d/2 \\ A_n\sin\left(k_nd/2\right)e^{\kappa_nd/2}e^{-\kappa_nz}, & z > d/2 \\ -A_n\sin\left(k_nd/2\right)e^{\kappa_nd/2}e^{\kappa_nz}, & z < -d/2 \end{cases} \qquad n = 2, 4, 6, \ldots \tag{6.44}$$

Here, $k_n = 2\varepsilon_n/d$, $\kappa_n = 2\sqrt{\varepsilon_0^2 - \varepsilon_n^2}/d$, while the corresponding values of energy
$E_{zn}$ are

$$E_{zn} = V_w + \frac{2\hbar^2\varepsilon_n^2}{m_ed^2}. \tag{6.45}$$

Fig. 6.5 Graphic solution of the eigenvalue equation for even and odd wave functions


You may notice that the wave functions in Eqs. 6.43 and 6.44 still contain undefined
coefficients $B_n$ and $A_n$ correspondingly, while coefficient $D$ was eliminated using
Eqs. 6.36 and 6.40. This is, however, normal because all eigenvectors and the functions
representing them are always defined only up to a constant factor (I have said that
already, and more than once, right?). As usual, values of these coefficients can be fixed by
the normalization condition

$$\int_{-\infty}^{\infty}\left|\varphi_n(z)\right|^2dz = 1.$$
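
For the even states the normalization integral can be carried out in closed form. The following is a sketch of that calculation, assuming $B_n$ is chosen real and positive; the only input beyond Eq. 6.43 is the eigenvalue condition $k_n\tan(k_nd/2) = \kappa_n$ of Eq. 6.38, which is used in the second step.

```latex
\int_{-\infty}^{\infty}\left|\varphi_n(z)\right|^2 dz
  = B_n^2\left[\int_{-d/2}^{d/2}\cos^2 k_n z\,dz
      + 2\cos^2\!\frac{k_n d}{2}\,e^{\kappa_n d}\int_{d/2}^{\infty}e^{-2\kappa_n z}\,dz\right]
  = B_n^2\left[\frac{d}{2} + \frac{\sin k_n d}{2k_n}
      + \frac{\cos^2\left(k_n d/2\right)}{\kappa_n}\right]
  = B_n^2\left[\frac{d}{2} + \frac{1}{\kappa_n}\right] = 1
\quad\Longrightarrow\quad
B_n = \left(\frac{d}{2} + \frac{1}{\kappa_n}\right)^{-1/2}
```

Repeating the same steps for the odd states with the condition $k_n\cot(k_nd/2) = -\kappa_n$ gives the same expression for $A_n$. The combination $d/2 + 1/\kappa_n$ can be read as half of an effective width of the well, enlarged by the penetration depth $1/\kappa_n$ of the wave function into the barrier.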

Figure 6.5 illustrates the process of emergence of the even and odd solutions
described above. The graph on the left refers to Eq. 6.39 for energies of even
states, and the graph on the right corresponds to Eq. 6.42 for energies of the odd
states. One can see that the crossing points on the graphs signifying values of the
dimensionless energy parameter $\varepsilon$ alternate in their values between even and odd
states: the lowest energy value comes from the graph on the left, the second lowest
appears in the graph on the right, and this alternation continues throughout all ten
energy values depicted in these plots. It is also instructive to plot the wave functions
corresponding to a few lowest energy eigenvalues. Graphs in Fig. 6.6 present (from
left to right) the ground state and the first and second excited states. In addition to
clearly demonstrating the even and odd nature of the respective states, these graphs
reveal an important phenomenon: a transition to the next higher energy level always
adds an extra zero to the corresponding wave function. This behavior is actually
a manifestation of a mathematical theorem valid for any one-dimensional problem
with a discrete spectrum: the number of zeroes of a wave function corresponding to
the $n$-th energy level ($n = 1$ corresponds to the ground state) is always equal to $n - 1$.
Since a rigorous proof of this statement is beyond our reach, I will illustrate this
point considering a limiting case of a very deep well, such that $\varepsilon_0 \gg 1$. In the limit
$\varepsilon_0 \to \infty$, Eq. 6.39 has solutions $\varepsilon_n = n\pi/2$, $n = 1, 3, 5, \ldots$, while solutions of
Eq. 6.42 are $\varepsilon_n = n\pi/2$, $n = 2, 4, 6, \ldots$. The corresponding energy values from
Eq. 6.45 coincide with those given in Eq. 5.89 (if one replaces $L$ with $d$) for a
particle whose motion is confined in a finite region of the total length $d$. Obviously,
this confinement corresponds to the limit of the potential well with infinitely high
barriers. The wave function in this case becomes

$$\varphi_n(z) = \begin{cases} B_n\cos\dfrac{n\pi z}{d}, & n = 1, 3, 5, \ldots \\[2mm] A_n\sin\dfrac{n\pi z}{d}, & n = 2, 4, 6, \ldots \end{cases}$$

within the well $|z| < d/2$, and it is exactly zero outside of the well ($\kappa_n$ goes to
infinity and makes the exponential terms $\exp\left[\kappa_n(-z + d/2)\right]$ for all $z > d/2$
and $\exp\left[\kappa_n(z + d/2)\right]$ for all $z < -d/2$ vanish). Now, one can clearly see how the increase
of $n$ by 1 transforms cos into sin (or sin into cos), adding an extra zero to the function when $z$ changes
from $-d/2$ to $d/2$.

Fig. 6.6 Wave functions corresponding to the three lowest energy eigenvalues of a rectangular
potential well

Unbound (Scattering) States: Continuous Spectrum 

The range of energies satisfying the condition $E_z > V_b$ corresponds to unbound
classical motion, where the entire domain $-\infty < z < \infty$ becomes classically
allowed. The motion of the classical particle depends on the combination of its
initial position and initial velocity: if a particle is initially at $z < -d/2$ with velocity
directed to the left or at $z > d/2$ with positive velocity, it will keep moving with the
same velocity, and the potential well will not affect its motion at all. If, however, the
initial motion of the particle is directed toward the well, it will experience infinite
acceleration (or deceleration) for an infinitesimally short time interval when passing
the points $z = \pm d/2$, which will result in a finite increase and then decrease of the
particle’s speed. After passing the region of the well, the particle resumes its uniform
straight-line motion with the same velocity as before.

Quantum mechanical behavior of the particle is described by the solution of the 
Schrodinger equation, which in the classically allowed region can be presented as 
a combination of exponential functions with complex arguments. In this spectral 
region, the symmetry arguments, which I used to find discrete energy levels and 
the corresponding wave functions, are no longer valid because of the inherent 
asymmetry in the initial conditions. You will see soon that this asymmetry, which is 
evident in the classical description of the unbound motion, will manifest itself in the 
quantum description as well. Therefore, it is no longer necessary to keep the origin 
of the coordinate axis at the center of the well, and the consideration becomes a bit 
more convenient if I move it to the left by d/2. In this case, the left boundary of 
the well corresponds to z = 0, and the coordinate regions, for which different wave 
functions must be written, are now defined as z < 0, 0 < z < d, and z > d. The 
most general solution for the wave function in each of these regions can be written 
down as 


$$\varphi(z) = \begin{cases} A_1e^{ik_1z} + B_1e^{-ik_1z}, & z < 0 \\ A_2e^{ik_2z} + B_2e^{-ik_2z}, & 0 < z < d \\ A_3e^{ik_1z} + B_3e^{-ik_1z}, & z > d \end{cases} \tag{6.46}$$

where $k_1 = \sqrt{2m_e\left(E_z - V_b\right)}/\hbar$ and $k_2 = \sqrt{2m_e\left(E_z - V_w\right)}/\hbar$. This expression for the wave
function has to be complemented by four stitching conditions, two at each of the
points of discontinuity. Requiring continuity of the function and its derivative at
$z = 0$ and $z = d$, I get

$$A_1 + B_1 = A_2 + B_2 \tag{6.47}$$

$$k_1\left(A_1 - B_1\right) = k_2\left(A_2 - B_2\right) \tag{6.48}$$

$$A_2e^{ik_2d} + B_2e^{-ik_2d} = A_3e^{ik_1d} + B_3e^{-ik_1d} \tag{6.49}$$

$$k_2\left(A_2e^{ik_2d} - B_2e^{-ik_2d}\right) = k_1\left(A_3e^{ik_1d} - B_3e^{-ik_1d}\right) \tag{6.50}$$

where Eqs. 6.47 and 6.49 ensure continuity of the wave function at $z = 0$ and $z = d$
correspondingly, while Eqs. 6.48 and 6.50 do the same for its derivative. Simple
counting of the number of unknown coefficients and comparing it with the number
of equations tells me that I have got a problem here: there are only four equations for
six unknowns, which is two unknowns too many. However, I have not yet specified the
desired behavior of the wave function at infinity (a boundary condition), which can
be useful in eliminating extra unknowns. Unlike the case of the discrete spectrum,
where the behavior of the wave functions at infinity is uniquely prescribed, here
I have an array of choices reflecting different physical situations for which the
problem at hand is being used.

Before digging into the issue of the boundary conditions at infinity for this 
problem, it might be useful to get a better physical understanding of the terms 
appearing in the expressions for $\varphi(z)$. To this end, let me dust off some results from
Sect. 5.1.3, namely, the concept of the probability current, Eq. 5.42, which in its
one-dimensional reincarnation takes the form of

$$j = \frac{\hbar}{2im_e}\left(\varphi^*\frac{d\varphi}{dz} - \varphi\frac{d\varphi^*}{dz}\right). \tag{6.51}$$

Substituting a generic form of the wave function $A_1e^{ik_1z} + B_1e^{-ik_1z}$ into Eq. 6.51,
you find

$$j = \frac{\hbar k_1}{m_e}\left|A_1\right|^2 - \frac{\hbar k_1}{m_e}\left|B_1\right|^2.$$

The first term in this expression describes a positive (directed in the positive $z$ direction)
probability current associated with the term $A_1e^{ik_1z}$ in the wave function, while the
second, negative, term describes a probability current in the opposite, negative $z$
direction and is associated with the term $B_1e^{-ik_1z}$ in the wave function. Now, imagine
a classical beam of particles of mass $m_e$ all moving with the same speed $v$ in the
positive direction of the $z$-axis. The current of particles in this beam (the number of
particles crossing a plane perpendicular to the flow per unit time per unit area of
the cross section of the beam) is easily found to be $Nv$, where $N$ is the number of
particles in the beam per unit volume. This expression coincides with the
quantum mechanical probability current if you replace $v$ with $p/m_e$, $p$ with $\hbar k$, and
identify $|A|^2$ with $N$. This comparison allows interpreting the terms of the wave
function containing $e^{ikz}$ as corresponding to the beam of particles propagating from
left to right and terms with $e^{-ikz}$ as describing particles propagating in the opposite
direction.
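
The statement that the current splits into a right-moving and a left-moving part with no cross term is easy to test numerically. The sketch below uses the standard one-dimensional probability current written as $j = (\hbar/m_e)\,\mathrm{Im}(\varphi^*\,d\varphi/dz)$, which is assumed here to be equivalent to Eq. 6.51. The units $\hbar = m_e = 1$, the amplitudes, and the function names are all illustrative choices, not taken from the text.

```python
import cmath

HBAR = 1.0   # natural units, an assumption of this sketch
MASS = 1.0


def wave(z, a, b, k):
    """Superposition a*exp(i k z) + b*exp(-i k z)."""
    return a * cmath.exp(1j * k * z) + b * cmath.exp(-1j * k * z)


def current(z, a, b, k, dz=1e-5):
    """j = (hbar/m) * Im(conj(phi) * dphi/dz), derivative by central difference."""
    dphi = (wave(z + dz, a, b, k) - wave(z - dz, a, b, k)) / (2.0 * dz)
    return HBAR / MASS * (wave(z, a, b, k).conjugate() * dphi).imag


# Hypothetical complex amplitudes and wave number.
a, b, k = 1.0 + 0.5j, 0.3 - 0.2j, 2.0

expected = HBAR * k / MASS * (abs(a) ** 2 - abs(b) ** 2)
values = [current(z, a, b, k) for z in (-1.0, 0.0, 0.7, 3.0)]

print("analytic current:", expected)
print("numerical current at several points:", values)
```

The numerical current comes out the same at every point, as it must for a stationary state, even though the probability density itself oscillates with $z$ because of the interference between the two terms.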

A typical experiment involving particles with energies in the continuous segment
of the spectrum consists of sending particles created by some source, positioned far
away from the potential well, toward the well and counting the number of particles
in the beam behind the well (transmitted particles) or the number of particles in
front of the well but propagating in the negative $z$ direction (reflected particles). In
this case, the asymptotic behavior of the wave function at negative infinity must
contain both left- and right-propagating currents, while the wave function at
positive infinity only contains the right-propagating particles. This gives us one
of the possible boundary conditions at infinity corresponding to this particular
experimental situation: $\varphi(z \to \infty) = A_3e^{ik_1z}$. For the wave function to have this
form, coefficient $B_3$ in Eq. 6.46 must be set to zero. As a result, I end up with five
unknown coefficients and the same four equations, and what is left to realize is
that the term $A_1e^{ik_1z}$ describes the current of particles created by the source, which
is external to the Schrödinger equation and is determined by an experimentalist
controlling the concentration of particles in the beam leaving the source. Thus, $A_1$ shall be
treated as a free parameter, while all remaining coefficients must be expressed in
terms of it.

Quantities actually measured in the experiment, i.e., the fraction of particles
reflected by the potential or the fraction of particles transmitted past the potential,
can be interpreted quantum mechanically as probabilities of reflection $R = j_r/j_{inc}$
and transmission $T = j_{tr}/j_{inc}$, where I introduced notations for the reflected current
$j_r = \hbar k_1|B_1|^2/m_e$, the incident current $j_{inc} = \hbar k_1|A_1|^2/m_e$, and the transmitted current
$j_{tr} = \hbar k_3|A_3|^2/m_e$. The wave number $k_3 = \sqrt{2m_e\left(E_z - V_\infty\right)}/\hbar$ is determined by the value
of the potential at $z \to \infty$, $V_\infty$. In the particular case I am dealing with now, the
potentials at $z < 0$ and $z > d$ are the same, so that $V_\infty = V_b$ and $k_3 = k_1$. One
should realize, however, that this is not always the case, so that one has to be careful
when defining the transmission probability. The most general expressions for the
reflection and transmission probabilities are

$$R = \frac{|B_1|^2}{|A_1|^2} \tag{6.52}$$

$$T = \frac{k_3}{k_1}\frac{|A_3|^2}{|A_1|^2}. \tag{6.53}$$

Now it becomes clear that in order to obtain an experimentally relevant form
of the scattering wave function, you need to solve the following system of
equations expressing all unknown coefficients in terms of the amplitude of the incident
particles $A_1$:

$$1 + r = A_2 + B_2, \tag{6.54}$$

$$k_1 (1 - r) = k_2 (A_2 - B_2), \tag{6.55}$$

$$A_2 e^{ik_2 d} + B_2 e^{-ik_2 d} = t e^{ik_1 d}, \tag{6.56}$$

$$k_2 \left(A_2 e^{ik_2 d} - B_2 e^{-ik_2 d}\right) = k_1 t e^{ik_1 d}. \tag{6.57}$$

Here, I introduced the amplitude reflection and transmission coefficients $r = B_1/A_1$
and $t = A_3/A_1$, respectively, and redefined amplitudes $A_2$ and $B_2$ as $A_2/A_1 \to A_2$
and $B_2/A_1 \to B_2$. Combining the first two equations, I obtain

$$\left(1 + \frac{k_1}{k_2}\right) + \left(1 - \frac{k_1}{k_2}\right) r = 2A_2,$$

$$\left(1 - \frac{k_1}{k_2}\right) + \left(1 + \frac{k_1}{k_2}\right) r = 2B_2,$$

while the other two yield

$$\left(1 + \frac{k_1}{k_2}\right) t e^{ik_1 d} = 2A_2 e^{ik_2 d},$$

$$\left(1 - \frac{k_1}{k_2}\right) t e^{ik_1 d} = 2B_2 e^{-ik_2 d}.$$

Expressing $A_2$ and $B_2$ from the last pair of equations and substituting them in the first
pair, I get

$$\left(1 + \frac{k_1}{k_2}\right) + \left(1 - \frac{k_1}{k_2}\right) r = \left(1 + \frac{k_1}{k_2}\right) t e^{ik_1 d} e^{-ik_2 d},$$

$$\left(1 - \frac{k_1}{k_2}\right) + \left(1 + \frac{k_1}{k_2}\right) r = \left(1 - \frac{k_1}{k_2}\right) t e^{ik_1 d} e^{ik_2 d},$$

which after some brushing up yields

$$e^{-ik_1 d} e^{ik_2 d} \left(1 + \frac{k_2 - k_1}{k_2 + k_1}\, r\right) = t,$$

$$e^{-ik_1 d} e^{-ik_2 d} \left(1 + \frac{k_2 + k_1}{k_2 - k_1}\, r\right) = t.$$

Equating the left-hand sides of these two equations gives

$$e^{-ik_1 d} e^{ik_2 d} \left(1 + \frac{k_2 - k_1}{k_2 + k_1}\, r\right) = e^{-ik_1 d} e^{-ik_2 d} \left(1 + \frac{k_2 + k_1}{k_2 - k_1}\, r\right),$$

$$r \left(\frac{k_2 + k_1}{k_2 - k_1} - \frac{k_2 - k_1}{k_2 + k_1}\, e^{2ik_2 d}\right) = e^{2ik_2 d} - 1.$$


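Finally, some simple algebraic manipulations, which I hope you can reproduce
yourselves, yield

$$r = \frac{\left(k_1^2 - k_2^2\right) \sin k_2 d}{\left(k_1^2 + k_2^2\right) \sin k_2 d + 2ik_2 k_1 \cos k_2 d}, \tag{6.58}$$

$$t = \frac{2ik_2 k_1 e^{-ik_1 d}}{\left(k_1^2 + k_2^2\right) \sin k_2 d + 2ik_2 k_1 \cos k_2 d}. \tag{6.59}$$

Now, you can easily find the two remaining coefficients $A_2$ and $B_2$:

$$A_2 = \frac{1}{2}\left(1 + \frac{k_1}{k_2}\right) e^{i(k_1 - k_2)d}\, t = \frac{1}{2}\left(1 + \frac{k_1}{k_2}\right) \frac{2ik_2 k_1 e^{-ik_2 d}}{\left(k_1^2 + k_2^2\right) \sin k_2 d + 2ik_2 k_1 \cos k_2 d}, \tag{6.60}$$

$$B_2 = \frac{1}{2}\left(1 - \frac{k_1}{k_2}\right) e^{i(k_1 + k_2)d}\, t = \frac{1}{2}\left(1 - \frac{k_1}{k_2}\right) \frac{2ik_2 k_1 e^{ik_2 d}}{\left(k_1^2 + k_2^2\right) \sin k_2 d + 2ik_2 k_1 \cos k_2 d}. \tag{6.61}$$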
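A quick way to guard against algebra slips at this point is to substitute the closed-form amplitudes back into the matching conditions, Eqs. 6.54-6.57, and confirm numerically that all four are satisfied. The sketch below is only an illustration: the function names and the sample values of $k_1$, $k_2$, and d are my own choices, and the transmission amplitude is assumed to carry the phase factor $e^{-ik_1 d}$ required by Eq. 6.56.

```python
import cmath


def well_amplitudes(k1: float, k2: float, d: float):
    """Closed-form r, t, A2, B2 for a well on 0 < z < d (outer k1, inner k2).

    All amplitudes are measured in units of the incident amplitude A1.
    """
    s, c = cmath.sin(k2 * d), cmath.cos(k2 * d)
    denom = (k1**2 + k2**2) * s + 2j * k1 * k2 * c
    r = (k1**2 - k2**2) * s / denom
    t = 2j * k1 * k2 * cmath.exp(-1j * k1 * d) / denom
    a2 = 0.5 * (1 + k1 / k2) * cmath.exp(1j * (k1 - k2) * d) * t
    b2 = 0.5 * (1 - k1 / k2) * cmath.exp(1j * (k1 + k2) * d) * t
    return r, t, a2, b2


def matching_residuals(k1: float, k2: float, d: float):
    """Left-hand side minus right-hand side of Eqs. 6.54-6.57."""
    r, t, a2, b2 = well_amplitudes(k1, k2, d)
    inner_plus = cmath.exp(1j * k2 * d)
    inner_minus = cmath.exp(-1j * k2 * d)
    outer = cmath.exp(1j * k1 * d)
    return [
        (1 + r) - (a2 + b2),
        k1 * (1 - r) - k2 * (a2 - b2),
        a2 * inner_plus + b2 * inner_minus - t * outer,
        k2 * (a2 * inner_plus - b2 * inner_minus) - k1 * t * outer,
    ]


if __name__ == "__main__":
    # Hypothetical sample values in arbitrary units.
    print(max(abs(x) for x in matching_residuals(1.3, 2.1, 0.7)))
```

A useful limiting case is $k_1 = k_2$ (no well at all), for which the sketch returns r = 0 and t = 1, as it should for a free particle.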


Now, once the expressions for the coefficients of the wave function are found in
terms of $A_1$, you might wonder if it is possible and/or necessary to fix the value of
the latter. Generally speaking, this is again a question of normalization of the wave
function, and according to our general understanding, we must be able to normalize
this function using the delta-function. However, when the wave function is not
just a plane wave, the procedure becomes rather cumbersome and requires careful
evaluation of diverging integrals. From a practical point of view, it does not make
much sense to jump through all these hoops to achieve the normalization, which would
matter only if you planned to use the resulting functions as a basis, and this almost never
happens. Thus, if you are only concerned with obtaining experimentally relevant
quantities, you will be happy to leave $A_1$ undetermined and use Eqs. 6.58 and 6.59
to find the reflection and transmission probabilities from Eqs. 6.52 and 6.53:

$$R = \frac{\left(k_1^2 - k_2^2\right)^2 \sin^2 k_2 d}{\left(k_1^2 + k_2^2\right)^2 \sin^2 k_2 d + 4k_1^2 k_2^2 \cos^2 k_2 d}, \tag{6.62}$$

$$T = \frac{4k_1^2 k_2^2}{\left(k_1^2 + k_2^2\right)^2 \sin^2 k_2 d + 4k_1^2 k_2^2 \cos^2 k_2 d}. \tag{6.63}$$

The denominator of these expressions can be rewritten in the following form:

$$\left(k_1^2 + k_2^2\right)^2 \sin^2 k_2 d + 4k_1^2 k_2^2 \cos^2 k_2 d = 4k_1^2 k_2^2 + \left(k_1^2 - k_2^2\right)^2 \sin^2 k_2 d.$$

Thanks to this rearrangement, you can realize two important facts. First, you can
immediately see that

$$R + T = 1 \tag{6.64}$$










and, second, that the transmission, considered as a function of energy, oscillates
between its maximum value equal to unity, achieved at $k_2 d = \pi n$, $n = 1, 2, 3, \ldots$,
and its minimum value

$$T_{min} = \frac{4k_2^2 k_1^2}{\left(k_1^2 + k_2^2\right)^2},$$

which occurs at $k_2 d = \pi/2 + \pi n$. For large values of energy $E_z \gg V_b$, when $k_1$
and $k_2$ become close to each other, the minimum value of transmission differs little
from unity, so that the transmission remains close to one for almost all energies. The
reflection probability, in this case, becomes correspondingly small for all energies as
well. This is the behavior close to what you would expect from a classical particle,
so that the high-energy limit corresponds to a transition to the classical regime. This behavior
is illustrated in Fig. 6.7.
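The oscillations of the transmission are easy to explore numerically. The following sketch evaluates Eqs. 6.62 and 6.63 as functions of energy and locates the resonances $k_2 d = \pi n$. It assumes units with $\hbar = m_e = 1$ and illustrative parameter values ($V_b = 0$ outside the well, $V_w = -10$ inside, d = 1); none of these numbers come from the text, and the function names are hypothetical.

```python
import math


def well_probabilities(E, V_b=0.0, V_w=-10.0, d=1.0, hbar=1.0, m=1.0):
    """Return (R, T) from Eqs. 6.62 and 6.63 for energy E above V_b."""
    if E <= V_b:
        raise ValueError("scattering states require E > V_b")
    k1 = math.sqrt(2.0 * m * (E - V_b)) / hbar  # wave number outside the well
    k2 = math.sqrt(2.0 * m * (E - V_w)) / hbar  # wave number inside the well
    s2 = math.sin(k2 * d) ** 2
    c2 = math.cos(k2 * d) ** 2
    denom = (k1**2 + k2**2) ** 2 * s2 + 4.0 * k1**2 * k2**2 * c2
    R = (k1**2 - k2**2) ** 2 * s2 / denom
    T = 4.0 * k1**2 * k2**2 / denom
    return R, T


def resonance_energy(n, V_w=-10.0, d=1.0, hbar=1.0, m=1.0):
    """Energy at which k2 * d = pi * n, i.e., where T reaches unity."""
    return V_w + (hbar * math.pi * n / d) ** 2 / (2.0 * m)


if __name__ == "__main__":
    for n in (2, 3):
        E = resonance_energy(n)
        print(n, E, well_probabilities(E))
```

With these parameters the n = 1 resonance falls below $V_b$ and therefore lies outside the continuous spectrum, which is why the loop starts from n = 2.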

Equation 6.64 is an important expression of the conservation of probability— 
it simply states that since transmission and reflection are the only two mutually 
exclusive events that can occur when a particle is incident on the potential, the sum 
of their probabilities must be equal to unity. Even though this relation was derived 
here for the particular case of the rectangular well, it is valid for a generic potential 
asymptotically approaching a constant value at $z \to \pm\infty$. In practice, the validity of Eq. 6.64
serves as a test of the correctness of Eqs. 6.62 and 6.63. Using the definitions
of transmission and reflection coefficients in terms of the probability currents, I can
rewrite Eq. 6.64 as

$$\frac{j_r}{j_{inc}} + \frac{j_{tr}}{j_{inc}} = 1 \;\Leftrightarrow\; j_{inc} - j_r = j_{tr}. \tag{6.65}$$

This equation establishes that the total probability current on the left of the potential 
well is equal to the probability current on its right, which is just a general statement 
of the conservation of probability, which can also be interpreted as a continuity of 
the probability current across any finite discontinuity of the potential. 


Fig. 6.7 Transmission 
probability for the rectangular 
potential well

Fig. 6.8 Spatial dependence of $|\varphi(z)|^2$ for three different values of energy: low energy, high
energy, and resonance energy where transmission goes to one and reflection to zero. The absence
of reflection in the last plot is evidenced by the absence of oscillations of the probability density
due to interference of the incident and reflected waves. The vertical lines delineate the edges of the
well


The behavior of the wave function also changes with energy. Figure 6.8 illustrates
this point by plotting the spatial dependence of the respective probability density
$|\varphi(z)|^2$ at three different energies, including the one which corresponds to zero
reflection. In the latter case, the probability distribution becomes flat at both
$z < -d/2$ and $z > d/2$, signaling the absence of interference between incident and
reflected waves. One can also notice the decrease in the period of the oscillations for
higher energies, as should be expected, because higher energy means a larger wave
number and a shorter wavelength.


6.2.2 Square Potential Barrier 

A square potential barrier is a potential well turned upside down: the higher
value of the potential energy $V_b$ is limited to the finite interval $|z| < d/2$, while the
lower energy $V_w$ corresponds to the semi-infinite regions $|z| > d/2$ outside of this
interval. The first principal difference between this situation and the one considered
in the previous section is that there are no energies corresponding to a classically
bound motion in this potential, and, therefore, there are no states corresponding to
discrete energy levels. In both cases, $E_z < V_b$ and $E_z > V_b$, classical motion is
unbound, and quantum mechanical states belong to the continuous spectrum (there
are no states with $E_z < V_w$). The difference between these energy regions is that in
the former case, the interval $|z| < d/2$ is classically forbidden, while in the latter,
the entire domain of the z-coordinate is classically allowed. Respectively, there are two
different types of wave functions: when $V_w < E_z < V_b$

$$\varphi(z) = \begin{cases} A_1 e^{ik_2 z} + B_1 e^{-ik_2 z}, & z < -d/2 \\ A_2 e^{\kappa_1 z} + B_2 e^{-\kappa_1 z}, & -d/2 < z < d/2 \\ A_3 e^{ik_2 z}, & z > d/2 \end{cases} \tag{6.66}$$

where $k_2$ is defined as in the previous section, while $\kappa_1 = \sqrt{2m_e (V_b - E_z)}/\hbar$ is related
to $k_1$ as $k_1 = -i\kappa_1$. I already mentioned it once, but I would like to emphasize
again: you cannot eliminate either of the real exponential functions in the second
line of Eq. 6.66 because the requirement for the wave function to decay at infinity
can be used only when the classically forbidden region extends to infinity. In the
case at hand, it is limited to the region $|z| < d/2$, so the exponential growth of the
wave function does not have enough “room” to become a problem. For energies
$E_z > V_b$, the wave function has the form of

$$\varphi(z) = \begin{cases} A_1 e^{ik_2 z} + B_1 e^{-ik_2 z}, & z < -d/2 \\ A_2 e^{ik_1 z} + B_2 e^{-ik_1 z}, & -d/2 < z < d/2 \\ A_3 e^{ik_2 z}, & z > d/2. \end{cases} \tag{6.67}$$


This wave function is essentially equivalent to the one considered in the case of
the potential well, so you can simply copy Eqs. 6.58 and 6.59 while exchanging $k_1$
and $k_2$:

$$r = \frac{e^{-ik_2 d} \left(k_2^2 - k_1^2\right) \sin k_1 d}{\left(k_1^2 + k_2^2\right) \sin k_1 d + 2ik_2 k_1 \cos k_1 d}, \tag{6.68}$$

$$t = \frac{2ik_2 k_1 e^{-ik_2 d}}{\left(k_1^2 + k_2^2\right) \sin k_1 d + 2ik_2 k_1 \cos k_1 d}. \tag{6.69}$$


Transmission and reflection coefficients in this case have all the same properties as 
in the case of the potential well, which I am not going to repeat again. 

Going back to the case $V_w < E_z < V_b$, it might appear that here I would have
to carry out all the calculations from scratch because now I have to deal with real
exponential functions. But fear not, you can still use the previous result by replacing
$k_1$ with $k_1 = -i\kappa_1$. The negative sign in this expression is important: it ensures
that the coefficient $A_2$ in Eq. 6.67 goes over to the same coefficient $A_2$ in Eq. 6.66
(the same obviously applies to coefficient $B_2$). In order to finish the transformation
of Eqs. 6.68 and 6.69 for the under-the-barrier case, you just need to recall that
$\sin(ix) = i \sinh x$ and $\cos(ix) = \cosh x$. With these relations in mind, you easily
obtain

$$r = \frac{B_1}{A_1} = \frac{-i e^{-ik_2 d} \left(k_2^2 + \kappa_1^2\right) \sinh \kappa_1 d}{-i \left(k_2^2 - \kappa_1^2\right) \sinh \kappa_1 d + 2k_2 \kappa_1 \cosh \kappa_1 d}, \tag{6.70}$$

$$t = \frac{A_3}{A_1} = \frac{2k_2 \kappa_1 e^{-ik_2 d}}{-i \left(k_2^2 - \kappa_1^2\right) \sinh \kappa_1 d + 2k_2 \kappa_1 \cosh \kappa_1 d}. \tag{6.71}$$

The respective reflection and transmission coefficients become

$$R = \frac{\left(k_2^2 + \kappa_1^2\right)^2 \sinh^2 \kappa_1 d}{\left(k_2^2 - \kappa_1^2\right)^2 \sinh^2 \kappa_1 d + 4k_2^2 \kappa_1^2 \cosh^2 \kappa_1 d}, \tag{6.72}$$

$$T = \frac{4k_2^2 \kappa_1^2}{\left(k_2^2 - \kappa_1^2\right)^2 \sinh^2 \kappa_1 d + 4k_2^2 \kappa_1^2 \cosh^2 \kappa_1 d}. \tag{6.73}$$

Even though I derived Eqs. 6.70 and 6.71 by merely extending Eqs. 6.68 and 6.69
to the region of imaginary $k_1$ (for the mathematically sophisticated: this procedure is
a simple example of what is known in mathematics as analytic continuation), the
properties of the reflection and transmission coefficients given by Eqs. 6.72 and 6.73 are very
different from those derived for the over-the-barrier transmission case $E_z > V_b$.
Gone are their periodic dependence on the energy and d, as well as the special values
of energy at which the transmission turns to unity and reflection goes to zero. What
do we have instead? Actually quite a boring picture: transmission decreases exponentially
with increasing width of the barrier d and grows as the energy rises from $V_w$ toward $V_b$,
because $\kappa_1 = \sqrt{2m_e (V_b - E_z)}/\hbar$ decreases and vanishes at
$E_z = V_b$. To illustrate the exponential dependence of the transmission on d, I will
consider the case of a “thick” barrier, which in mathematical language means $\kappa_1 d \gg 1$.
To find the required approximate expressions for T and R, I need to remind you of a
simple property of the hyperbolic functions $\cosh x$ and $\sinh x$: for large values of their
argument x, these functions can be approximated by a simple exponential, $\sinh x \approx
\cosh x \approx \frac{1}{2} \exp x$. Taking this into account, I can derive

$$R \approx 1, \tag{6.74}$$

$$T \approx \frac{16 k_2^2 \kappa_1^2}{\left(k_2^2 + \kappa_1^2\right)^2}\, e^{-2\kappa_1 d}. \tag{6.75}$$


When deriving the expression for the reflection coefficient, I “lost” the exponentially 
small term, which is supposed to be subtracted from unity to ensure conservation 
of probability. At the same time, this term makes the main contribution to the
transmission coefficient and, therefore, survives. A better approximation for the
reflection coefficient can be found simply by writing it down as R = 1 — T. 
Obviously, the same results can be derived directly from Eq. 6.72 by being a bit 
more careful and keeping leading exponentially small terms. 
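To get a feeling for how quickly the thick-barrier limit sets in, one can compare the exact under-the-barrier transmission with its exponential approximation. The sketch below assumes the standard leading-order result $T \approx 16 k_2^2 \kappa_1^2 e^{-2\kappa_1 d}/(k_2^2 + \kappa_1^2)^2$, obtained by replacing both hyperbolic functions with $\frac{1}{2}\exp(\kappa_1 d)$; the function names and the sample numbers are illustrative only.

```python
import math


def barrier_T_exact(k2: float, kappa1: float, d: float) -> float:
    """Exact transmission for V_w < E_z < V_b (outer k2, decay constant kappa1)."""
    sh = math.sinh(kappa1 * d)
    ch = math.cosh(kappa1 * d)
    denom = (k2**2 - kappa1**2) ** 2 * sh**2 + 4.0 * k2**2 * kappa1**2 * ch**2
    return 4.0 * k2**2 * kappa1**2 / denom


def barrier_T_thick(k2: float, kappa1: float, d: float) -> float:
    """Leading-order approximation valid for kappa1 * d >> 1."""
    prefactor = 16.0 * k2**2 * kappa1**2 / (k2**2 + kappa1**2) ** 2
    return prefactor * math.exp(-2.0 * kappa1 * d)


if __name__ == "__main__":
    # Dimensionless sample values: k2 = kappa1 = 1, growing barrier width.
    for d in (0.5, 1.0, 2.0, 5.0):
        exact = barrier_T_exact(1.0, 1.0, d)
        approx = barrier_T_thick(1.0, 1.0, d)
        print(d, exact, approx, approx / exact)
```

The ratio printed in the last column approaches one as $\kappa_1 d$ grows, while for $\kappa_1 d \lesssim 1$ the approximation visibly overestimates the transmission.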

What is surprising here is, of course, not the fact that the transmission is small,
but that it is not exactly equal to zero. This means that there exists a
non-zero probability for the particle to travel across a classically forbidden region,
emerge on the other side, and keep moving as a free particle. This phenomenon,
which is a quantum mechanical version of “walking through the wall,” is called
tunneling, and you can hear physicists saying that the particle tunnels through
the barrier. The exponential nature of the dependence upon d is very important,
because the exponential function is one of the fastest changing functions appearing
in the mathematical description of natural processes. It means that a small change
in d results in a substantial change in transmission. This effect has vast practical
importance and is used in many applications such as tunnel diodes, tunneling
microscopy, flash memory, etc.
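The sensitivity to the barrier width can be made concrete with a rough estimate. In the thick-barrier limit the energy-dependent prefactor cancels in the ratio of two transmissions, leaving only $\exp[-2\kappa_1 (d_{new} - d_{old})]$. The sketch below uses an assumed, purely illustrative situation (an electron whose energy lies 1 eV below the top of the barrier, and a width reduced from 1.0 nm to 0.9 nm); the constants are the standard SI values, and the function names are my own.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # one electron volt in joules


def decay_constant(barrier_minus_energy_ev: float) -> float:
    """kappa_1 = sqrt(2 m_e (V_b - E_z)) / hbar in 1/m, for V_b - E_z given in eV."""
    return math.sqrt(2.0 * M_E * barrier_minus_energy_ev * EV) / HBAR


def transmission_ratio(barrier_minus_energy_ev: float,
                       d_old_nm: float, d_new_nm: float) -> float:
    """T(d_new) / T(d_old) in the thick-barrier limit (prefactor cancels)."""
    kappa = decay_constant(barrier_minus_energy_ev)
    return math.exp(-2.0 * kappa * (d_new_nm - d_old_nm) * 1e-9)


if __name__ == "__main__":
    # Assumed example: V_b - E_z = 1 eV, width reduced from 1.0 nm to 0.9 nm.
    print(transmission_ratio(1.0, 1.0, 0.9))
```

Shaving a tenth of a nanometer off the barrier in this example changes the transmission by a factor between 2 and 3, which is the kind of sensitivity exploited in tunneling microscopy.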


6.3 Delta-Functional Potential 

In this section, I will present a rather peculiar model potential, which does not really 
have direct analogies in the real world. I can justify spending some time on it by 
making three simple points: (a) it is easily solvable, so considering it would not 
take too much of our time, (b) it is useful as an illustration of a situation when the 
derivative of the wave function loses its continuity property, and (c) in the case of 
shallow potential wells, which are able to hold only a single bound state, it can 
provide a decent qualitative understanding of real physical situations. This utterly 
unrealistic potential has the form of a delta-function 

$$V(z) = -g\delta(z), \tag{6.76}$$

where the negative sign signifies that the potential is attractive and that states
with negative energies are possible and must belong to the discrete spectrum.
Indeed, the entire region of z except for the single point z = 0 is classically forbidden,
so the motion of a classical particle, if one can regard being localized at a single
point as motion, is finite. Parameter g in this expression represents the “strength”
of the potential, but one needs to understand that the dimension of this parameter is
energy × length, so it should not be interpreted as a “magnitude” of the potential.
This becomes obvious if one integrates Eq. 6.76: $g = -\int V(z)\,dz$, so it is clear that g is
the area enclosed by the potential. If one thinks of the delta-function as a limiting case of
a rectangular potential of depth $V_w$ and width d, with $V_w \to \infty$ and $d \to 0$ in such
a way that $g = V_w d$ remains constant, the meaning of this parameter becomes even
more transparent.

The main peculiarity of this model is that the discontinuity of the potential in this
case involves more than just a finite jump, so that my previous arguments concerning
the continuity of the derivative of the wave function are no longer applicable.
Actually, this derivative is not continuous at all, and the first order of business
is to figure out how to “stitch” derivatives of the wave function defined at z < 0
with those defined at z > 0. To solve this puzzle, let me start with the basics, the
Schrödinger equation


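$$-\frac{\hbar^2}{2m_e}\frac{d^2\varphi}{dz^2} - g\delta(z)\varphi(z) = E\varphi(z). \tag{6.77}$$

Integrating this equation over an infinitesimally small interval $(-\epsilon, \epsilon)$ and taking into
account that an integral of a continuous function over such an interval is zero (in the
limit $\epsilon \to 0$), I get

$$-\frac{\hbar^2}{2m_e}\left(\left.\frac{d\varphi}{dz}\right|_{z=\epsilon} - \left.\frac{d\varphi}{dz}\right|_{z=-\epsilon}\right) - g\varphi(0) = 0.$$

This yields the derivative stitching rule:

$$\left.\frac{d\varphi}{dz}\right|_{z=\epsilon} - \left.\frac{d\varphi}{dz}\right|_{z=-\epsilon} = -\frac{2m_e}{\hbar^2}\, g\varphi(0). \tag{6.78}$$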


Now, all I need is to solve the Schrödinger equation with zero potential and
negative energy separately for z < 0 and z > 0 and stitch the solutions. Since both
these regions are classically forbidden for a particle with E < 0, the solutions have
the form of real-valued exponential functions:

$$\varphi(z) = \begin{cases} A_1 e^{\kappa z}, & z < 0 \\ A_2 e^{-\kappa z}, & z > 0 \end{cases}$$

where $\kappa = \sqrt{-2m_e E}/\hbar$, and I discarded the contributions which would grow
exponentially at positive and negative infinities to satisfy the boundary conditions.
Continuity of the wave function at z = 0 requires that $A_2 = A_1 \equiv A$, and Eq. 6.78
yields

$$-2\kappa A = -\frac{2m_e}{\hbar^2}\, gA.$$

Assuming that A is non-zero (naturally) and taking into account the definition of $\kappa$,
I find that this expression is reduced to the equation for the allowed energy level:

$$E = -\frac{m_e g^2}{2\hbar^2}. \tag{6.79}$$

Obviously, Eq. 6.79 shows that there is only one such energy, which is why this
model can only be useful for the description of shallow potential wells with a single
discrete energy level.
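The claim that the delta-potential mimics a shallow well can be tested numerically. The sketch below takes a rectangular well of depth $V_0 = g/d$ and width d, solves the standard even-state condition $k \tan(kd/2) = \kappa$ by bisection, and compares the result with Eq. 6.79 as d shrinks at fixed g. The matching condition for the finite well is quoted here as a textbook fact rather than derived, units with $\hbar = m_e = 1$ are assumed, and all names are hypothetical.

```python
import math


def shallow_well_ground_energy(g, d, hbar=1.0, m=1.0, iterations=200):
    """Ground-state energy of a well of depth g/d and width d (energy zero outside).

    Assumes the well is shallow enough that sqrt(2 m V0) * d / hbar < pi,
    so that tan(k d / 2) stays finite and positive on the search interval.
    """
    v0 = g / d

    def mismatch(binding):
        # binding = |E|; k inside the well, kappa outside.
        k = math.sqrt(2.0 * m * (v0 - binding)) / hbar
        kappa = math.sqrt(2.0 * m * binding) / hbar
        return k * math.tan(0.5 * k * d) - kappa

    lo, hi = 0.0, v0  # mismatch > 0 at lo, < 0 at hi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return -0.5 * (lo + hi)


def delta_well_energy(g, hbar=1.0, m=1.0):
    """Eq. 6.79: the single bound level of the delta-potential."""
    return -m * g**2 / (2.0 * hbar**2)


if __name__ == "__main__":
    for width in (0.1, 0.01, 0.001):
        print(width, shallow_well_ground_energy(1.0, width), delta_well_energy(1.0))
```

As the width decreases with $g = V_0 d$ held fixed, the finite-well energy approaches the delta-potential value from above, which is the limiting procedure described earlier in this section.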

Solutions with positive energies can be constructed in the same way as was
done for the rectangular potential well or barrier:

$$\varphi(z) = \begin{cases} A_1 e^{ikz} + B_1 e^{-ikz}, & z < 0 \\ A_2 e^{ikz}, & z > 0 \end{cases} \tag{6.80}$$

where $k = \sqrt{2m_e E}/\hbar$, and the continuity of the wave function at z = 0 yields

$$A_1 + B_1 = A_2.$$

The derivative stitching condition, Eq. 6.78, generates the following equation:

$$ikA_2 - ik(A_1 - B_1) = -\frac{2m_e}{\hbar^2}\, gA_2.$$

Solving these two equations for $B_1$ and $A_2$, one can obtain

$$\frac{A_2}{A_1} = \frac{1}{1 - i\chi}, \qquad \frac{B_1}{A_1} = \frac{i\chi}{1 - i\chi},$$

where I introduced a convenient dimensionless parameter $\chi$ defined as

$$\chi = \frac{m_e g}{\hbar^2 k}.$$

The amplitude transmission and reflection coefficients $t = A_2/A_1$ and $r = B_1/A_1$
are complex numbers, which can be presented in the exponential form using the Euler
formula as

$$r = \sqrt{R}\, e^{i\theta_r}; \qquad t = \sqrt{T}\, e^{i\theta_t},$$

where the reflection and transmission probabilities $R = |r|^2$ and $T = |t|^2$ and the
corresponding phases $\theta_r$ and $\theta_t$ are given (for $\chi > 0$) by

$$R = \frac{\chi^2}{1 + \chi^2}; \qquad T = \frac{1}{1 + \chi^2}, \tag{6.81}$$

$$\theta_r = \pi - \arctan\frac{1}{\chi}; \qquad \theta_t = \arctan\chi. \tag{6.82}$$

While the phases of the amplitude reflection and transmission coefficients do not
affect the probabilities, they still play an important role and can be observed. The
reflected wave function interferes with the function describing incident particles
and determines the spatial distribution of relative probabilities of position
measurements. These phases also define the temporal behavior of the particles in
situations involving nonstationary states, but discussion of this situation is outside
the scope of this book.
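The amplitudes for the delta-potential are simple enough to check in a few lines. The sketch below starts from $t = 1/(1 - i\chi)$ and $r = i\chi/(1 - i\chi)$, recovers the probabilities, and extracts the phases with the standard library function `cmath.phase`. The function name is hypothetical, and only the attractive case $\chi > 0$ is exercised.

```python
import cmath
import math


def delta_scattering(chi: float):
    """Amplitude reflection and transmission coefficients for V = -g delta(z)."""
    t = 1.0 / (1.0 - 1j * chi)
    r = 1j * chi / (1.0 - 1j * chi)
    return r, t


if __name__ == "__main__":
    chi = 0.5  # assumed sample value of m_e g / (hbar^2 k)
    r, t = delta_scattering(chi)
    R, T = abs(r) ** 2, abs(t) ** 2
    print("R + T =", R + T)
    # For chi > 0, r lies in the second quadrant of the complex plane,
    # so its phase equals pi - atan(1 / chi).
    print("theta_r =", cmath.phase(r), "theta_t =", cmath.phase(t))
    print("continuity 1 + r - t =", abs(1 + r - t))
```

The last line checks the continuity condition $A_1 + B_1 = A_2$, which in terms of the amplitude coefficients reads 1 + r = t.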


6.4 Problems 
Problems for Sect. 6.2.1 
Problem 77 Derive Eq. 6.42. 

Problem 78 Find the reflection and transmission coefficients for the potential 
barrier shown in Fig. 6.9. Show that R + T = 1. 

Problem 79 In quantum tunneling, the penetration probability is sensitive to slight
changes in the height and/or width of the barrier. Consider an electron with energy
E = 15 eV incident on a rectangular barrier of height V = 7 eV and width
d = 1.8 nm. By what factor does the penetration probability change if the width
is decreased to d = 1.7 nm?

Problem 80 Consider a step potential

$$V(x) = \begin{cases} 0, & x < 0 \\ V_0, & x > 0. \end{cases}$$

Calculate the reflection and transmission probabilities for two cases, $0 < E < V_0$
and $E > V_0$.

Fig. 6.9 Potential barrier with an asymmetric potential

Problem 81 Find an equation for the energy levels of a particle of mass $m_e$ moving
in a potential of the form

$$V(x) = \begin{cases} \infty, & x < -a \\ 0, & -a < x < -b \\ V_0, & -b < x < b \\ 0, & b < x < a \\ \infty, & x > a. \end{cases}$$

Consider even and odd wave functions separately. Using any graphic software, find
the approximate values of the two lowest values of the energy if $m = 1.78 \times
10^{-27}$ kg, a = 0.12 nm, b = 0.42 nm, $V_0$ = 1.5 eV. Sketch the respective wave
functions for each of the found eigenvalues.

Problem 82 Consider a particle moving in a potential comprised of two attractive
delta-functional potentials separated by a distance d:

$$V(x) = -g\delta(x + d/2) - g\delta(x - d/2).$$

1. Derive an equation for discrete energy levels in this potential, and solve it if
possible. How many discrete energy levels does this potential have? Analyze the
behavior of these energy levels when the distance d between the wells increases.

2. Find the wave functions corresponding to the continuous segment of the
spectrum, and determine the respective transmission and reflection probabilities.



Chapter 7 

Harmonic Oscillator Models 






It is as difficult to overestimate the role of harmonic oscillator models in physics in
general and in quantum mechanics in particular as the influence of the Beatles and Led
Zeppelin on modern popular music. Harmonic oscillators are ubiquitous and appear
every time one is dealing with a system that has a state of equilibrium in the
vicinity of which it can oscillate, i.e., in the vast majority of physical systems: atoms,
molecules, solids, the electromagnetic field, etc. It also does not hurt their popularity
that the harmonic oscillator is one of the very few models which can be solved
exactly.

Consider a particle moving in a potential V(x, y, z), which has a minimum at
some point x = y = z = 0. Mathematically speaking, this means that at this
point $\partial V/\partial x = \partial V/\partial y = \partial V/\partial z = 0$, while the matrix of the second derivatives
$L_{ij} = \left.\partial^2 V/\partial r_i \partial r_j\right|_{x=y=z=0}$, where $r_1 = x$, $r_2 = y$, and $r_3 = z$, is positive definite.
If you still remember the connection between the potential energy and the force in
classical mechanics, you should recognize that in this situation, point x = y = z = 0
corresponds to the particle being in the state of stable equilibrium. Stable in this
context means that a particle removed from the equilibrium by a small distance will
be forced to move back toward it rather than away from it. Expanding potential
energy in a power series in the vicinity of the equilibrium and keeping only the first
nonvanishing terms, you will get

$$V(x, y, z) \approx V(0, 0, 0) + \frac{1}{2}\sum_{i,j} L_{ij} r_i r_j.$$

Respective classical Hamiltonian equations 3.2 and 3.3 yield for this potential:

$$\frac{dp_i}{dt} = -\sum_j L_{ij} r_j, \tag{7.1}$$

$$\frac{dr_i}{dt} = \frac{p_i}{m_e} \tag{7.2}$$

(i = x, y, z). They can be converted into Newton’s equations by differentiating
Eq. 7.2 (with respect to time) and eliminating the resulting time derivative of the
momentum using Eq. 7.1:

$$\frac{d^2 r_i}{dt^2} = -\frac{1}{m_e}\sum_j L_{ij} r_j. \tag{7.3}$$


The presence of nondiagonal elements in matrix $L_{ij}$ indicates that the particle’s
motion in the direction of any of the chosen axes X, Y, or Z is not independent
of its motion in other directions. In layman’s terms, it means that it is impossible
to arrange for this particle to move purely in the direction of any of the axes.
Nevertheless, solutions of these equations can still be presented in the standard
time-harmonic form $r_i = a_i \exp(i\omega t)$ with amplitudes $a_i$ and frequency $\omega$ obeying
the equations

$$\frac{1}{m_e}\sum_j L_{ij} a_j = \omega^2 a_i, \tag{7.4}$$

which is an eigenvalue equation for the matrix $L_{ij}/m_e$. This matrix is obviously
symmetric ($L_{ij} = L_{ji}$), real-valued, and, therefore, Hermitian. Thus, based
on the eigenvalue theorems discussed in Sect. 3.3.1, this matrix is guaranteed to
have real eigenvalues and corresponding orthogonal eigenvectors. The equation for
the eigenvalues is found by requiring that Eq. 7.4 has nontrivial solutions:

$$\det\left(m_e \omega^2 \delta_{ij} - L_{ij}\right) = 0,$$

and in general has three solutions $\omega_n^2$, where n = 1, 2, 3. Substituting each of
these frequencies back in Eq. 7.4, you can find amplitudes $a_1^{(n)}, a_2^{(n)}, a_3^{(n)}$, which
form the corresponding eigenvectors. These eigenvectors are regular three-dimensional
vectors defining three mutually orthogonal directions in space. Oscillations in
each of these directions, called normal modes, are characterized by their unique
frequencies $\omega_n$ and can occur independently of each other. Indeed, these three
vectors can be used as a new basis, which in this particular case amounts to
introducing new coordinate axes along the directions of the normal modes. The
matrix $L_{ij}/m_e$ transformed to this basis becomes diagonal, and introducing notation
$\xi_1$, $\xi_2$, and $\xi_3$ to represent coordinates along these new directions, Eq. 7.3 will take
the form of three independent differential equations:

$$\frac{d^2 \xi_n}{dt^2} = -\omega_n^2 \xi_n, \qquad n = 1, 2, 3. \tag{7.5}$$

One can also show (those interested in details are welcome to read any of the many
textbooks on classical mechanics, or molecular oscillations, or a combination thereof)
that the Hamiltonian written in terms of these new coordinates and the corresponding
conjugate momenta $\pi_n$ takes the form

$$H = \sum_{n=1}^{3}\left(\frac{\pi_n^2}{2m_n} + \frac{1}{2} m_n \omega_n^2 \xi_n^2\right),$$

which is the sum of three independent one-dimensional Hamiltonians. The transition
to this form is not so trivial, and the mass parameter $m_n$ does not have to coincide
with the actual mass of the particle. Nevertheless, as long as $\pi_n$ and $\xi_n$ are a
canonically conjugate pair characterized by the standard coordinate-momentum
Poisson brackets, Eq. 3.5, we can treat them as such for all practical
purposes, including quantization.

Thus, using the concept of normal modes, one can always reduce a problem
involving harmonic oscillations to a simple combination of one-dimensional
problems. This is actually true even in the case involving oscillations of several
particles such as multi-atom molecules. Therefore, using the one-dimensional model
to describe harmonic oscillations is even more justified than the one-dimensional
models described in the previous chapter. And so, the one-dimensional model of the
quantum harmonic oscillator is what I am going to consider next.
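The reduction to normal modes can be illustrated with the smallest nontrivial example. The sketch below treats a two-dimensional version of the problem (a simplification of the three-dimensional case discussed above, chosen so that the eigenvalue problem has a closed-form solution) and returns the normal-mode frequencies together with the orthogonal directions along which the oscillations decouple. The function name and the sample stiffness matrix are assumptions made for illustration.

```python
import math


def normal_modes_2d(l_xx: float, l_xy: float, l_yy: float, mass: float = 1.0):
    """Frequencies and unit direction vectors for V = (1/2) sum_ij L_ij r_i r_j in 2D.

    Solves the eigenvalue problem (1/m) L a = omega^2 a for a symmetric 2x2 matrix L.
    """
    mean = 0.5 * (l_xx + l_yy)
    half_gap = math.hypot(0.5 * (l_xx - l_yy), l_xy)
    if mean - half_gap <= 0.0:
        raise ValueError("matrix L must be positive definite")
    # Rotation angle of the principal axes of the quadratic form.
    theta = 0.5 * math.atan2(2.0 * l_xy, l_xx - l_yy)
    stiff_direction = (math.cos(theta), math.sin(theta))   # eigenvalue mean + half_gap
    soft_direction = (-math.sin(theta), math.cos(theta))   # eigenvalue mean - half_gap
    return [
        (math.sqrt((mean - half_gap) / mass), soft_direction),
        (math.sqrt((mean + half_gap) / mass), stiff_direction),
    ]


if __name__ == "__main__":
    # Assumed sample matrix with a nonzero off-diagonal element.
    for omega, direction in normal_modes_2d(2.0, 1.0, 2.0):
        print(omega, direction)
```

For this sample matrix the modes point along the two diagonals of the XY plane, even though motion along X or Y alone is impossible because of the coupling term $L_{xy}$.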


7.1 One-Dimensional Harmonic Oscillator 

7.1.1 Stationary States (Eigenvalues and Eigenvectors) 

Classical mechanics of the one-dimensional harmonic oscillator is described by the
Hamiltonian

$$H = \frac{p^2}{2m_e} + \frac{1}{2} m_e \omega^2 x^2, \tag{7.6}$$

and the respective Hamiltonian equations for momentum p and coordinate x are
obtained by specializing Eqs. 7.1 and 7.2 to the one-dimensional situation:

$$\frac{dp}{dt} = -m_e \omega^2 x, \tag{7.7}$$

$$\frac{dx}{dt} = \frac{p}{m_e}, \tag{7.8}$$

where I replaced the corresponding diagonal element of matrix $L_{ij}$ as $L_{xx} = m_e \omega^2$.
These equations are, of course, easy to solve, and the solution is well known:

$$x = x_0 \cos\omega t + \frac{p_0}{m_e \omega} \sin\omega t, \qquad p = -m_e \omega x_0 \sin\omega t + p_0 \cos\omega t, \tag{7.9}$$


where $x_0$ and $p_0$ are the initial values of the coordinate and momentum of the
particle. Equation 7.9 describes a familiar harmonic time dependence, which can
be presented in terms of amplitude A and initial phase $\delta$:

$$x(t) = A \sin(\omega t + \delta).$$

Both A and $\delta$ are determined by the initial conditions: the amplitude by the total
energy E of the oscillator, which, as you know, is a conserved quantity, and the phase
by the ratio of the initial coordinate and momentum. Recalling that at the maximum
displacement E takes entirely the form of the potential energy, you can write

$$\frac{1}{2} m_e \omega^2 A^2 = E = \frac{p_0^2}{2m_e} + \frac{1}{2} m_e \omega^2 x_0^2 \;\Rightarrow\; A = \sqrt{\frac{2E}{m_e \omega^2}} = \sqrt{x_0^2 + \frac{p_0^2}{m_e^2 \omega^2}}. \tag{7.10}$$

The phase of the oscillator can be found by expanding $\sin(\omega t + \delta) = \sin\omega t \cos\delta +
\cos\omega t \sin\delta$ and equating the resulting terms with their counterparts in Eq. 7.9. This
yields

$$A \cos\delta = \frac{p_0}{m_e \omega}, \qquad A \sin\delta = x_0,$$

and subsequently

$$\tan\delta = \frac{x_0 m_e \omega}{p_0}.$$
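The equivalence of the two forms of the classical solution is easy to confirm numerically. The sketch below computes A and $\delta$ from the initial conditions and compares $A \sin(\omega t + \delta)$ with the first line of Eq. 7.9 at a few moments of time. The function names and the sample initial conditions are my own; `math.atan2` is used instead of a plain arctangent so that the phase lands in the correct quadrant for any signs of $x_0$ and $p_0$.

```python
import math


def position(t, x0, p0, mass, omega):
    """Coordinate from Eq. 7.9."""
    return x0 * math.cos(omega * t) + p0 / (mass * omega) * math.sin(omega * t)


def amplitude_and_phase(x0, p0, mass, omega):
    """Amplitude A (Eq. 7.10) and phase delta with A sin(delta) = x0, A cos(delta) = p0/(m omega)."""
    amplitude = math.hypot(x0, p0 / (mass * omega))
    delta = math.atan2(x0, p0 / (mass * omega))
    return amplitude, delta


if __name__ == "__main__":
    x0, p0, mass, omega = 0.3, -1.2, 2.0, 1.5  # assumed sample values
    A, delta = amplitude_and_phase(x0, p0, mass, omega)
    for t in (0.0, 0.4, 1.7, 3.1):
        print(t, position(t, x0, p0, mass, omega), A * math.sin(omega * t + delta))
```

The two printed columns coincide, and the amplitude also satisfies $\frac{1}{2} m_e \omega^2 A^2 = E$ with E computed from the initial conditions.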

Obviously, the motion of a harmonic oscillator is bounded, with the maximum
deviation from the equilibrium position given by its amplitude A, Eq. 7.10. The
coordinate x becomes equal to ±A at the two turning points, where the velocity of the
oscillator and, respectively, its kinetic energy turn to zero. The relation between
total, potential, and kinetic energies of the harmonic oscillator can be illustrated
by the diagram shown in Fig. 7.1, where vertical lines show the turning points of the
classical motion.

Fig. 7.1 Energy diagram for a classical oscillator. The horizontal line corresponds to
its total energy E, and vertical dashed lines indicate the turning points and
coordinates corresponding to two maximum displacements

Even though you have all known the solution to the harmonic oscillator problem
almost since elementary school, you might find it useful to play with its
Hamiltonian a bit more. Let me, for instance, factorize the Hamiltonian, taking
advantage of its $u^2 + v^2$ form, which can be presented as $(u + iv)(u - iv)$:



$$H = \left(\frac{p}{\sqrt{2m_e}} + i\sqrt{\frac{m_e}{2}}\,\omega x\right)\left(\frac{p}{\sqrt{2m_e}} - i\sqrt{\frac{m_e}{2}}\,\omega x\right). \tag{7.11}$$

Now, on a whim, I am going to compute the Poisson bracket involving these factors.
Designating the first of them as u,

$$u = \frac{p}{\sqrt{2m_e}} + i\sqrt{\frac{m_e}{2}}\,\omega x,$$

and the second one as $u^*$,

$$u^* = \frac{p}{\sqrt{2m_e}} - i\sqrt{\frac{m_e}{2}}\,\omega x,$$

I find, using Eq. 3.4 for the Poisson bracket,

$$\{u, u^*\} = \frac{\partial u}{\partial x}\frac{\partial u^*}{\partial p} - \frac{\partial u}{\partial p}\frac{\partial u^*}{\partial x} = i\omega.$$

Using this result as a hint, I now introduce new variables:

$$b = -\frac{i}{\sqrt{\omega}}\, u = \sqrt{\frac{m_e \omega}{2}}\, x - i\frac{p}{\sqrt{2m_e \omega}}, \qquad b^\dagger = \frac{1}{\sqrt{\omega}}\, u^* = -i\sqrt{\frac{m_e \omega}{2}}\, x + \frac{p}{\sqrt{2m_e \omega}}, \tag{7.12}$$

whose Poisson bracket, by design, of course, is

$$\{b, b^\dagger\} = 1. \tag{7.13}$$

This means that $b$ and $b^\dagger$ constitute a canonically conjugate pair (if you have
already forgotten what I am talking about, check Sect. 3.1), with $b$ playing the role
of the coordinate and $b^\dagger$ pretending to be the momentum. Computing $bb^\dagger$ (do it!),
you will see that the Hamiltonian can be presented as

$$H = i\omega b b^\dagger.$$

The corresponding Hamiltonian equations are

$$\frac{db}{dt} = \frac{\partial H}{\partial b^\dagger} = i\omega b, \tag{7.14}$$

$$\frac{db^\dagger}{dt} = -\frac{\partial H}{\partial b} = -i\omega b^\dagger. \tag{7.15}$$

The advantage of these equations as compared to the initial Eqs. 7.7 and 7.8 is that they
are independent first-order differential equations, which can be easily solved:

$$b = b_0 e^{i\omega t}; \qquad b^\dagger = b_0^\dagger e^{-i\omega t}. \tag{7.16}$$

The initial coordinate and momentum can be expressed in terms of $b$ and $b^\dagger$ by inverting
Eqs. 7.12, but I will leave it for you as an exercise.
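Readers who prefer to see the decoupling at work can check it numerically. The sketch below builds the pair $b = \sqrt{m_e\omega/2}\,x - ip/\sqrt{2m_e\omega}$ and $b^\dagger = -i\sqrt{m_e\omega/2}\,x + p/\sqrt{2m_e\omega}$ from the classical trajectory of Eq. 7.9 and verifies that each variable simply rotates in the complex plane, while the product $i\omega b b^\dagger$ reproduces the energy. The function names and the sample numbers are hypothetical.

```python
import cmath
import math


def trajectory(t, x0, p0, mass, omega):
    """Classical solution, Eq. 7.9: returns (x(t), p(t))."""
    x = x0 * math.cos(omega * t) + p0 / (mass * omega) * math.sin(omega * t)
    p = -mass * omega * x0 * math.sin(omega * t) + p0 * math.cos(omega * t)
    return x, p


def b_pair(x, p, mass, omega):
    """Canonical pair (b, b_dagger) built from coordinate and momentum."""
    alpha = math.sqrt(mass * omega / 2.0)
    beta = 1.0 / math.sqrt(2.0 * mass * omega)
    return alpha * x - 1j * beta * p, -1j * alpha * x + beta * p


if __name__ == "__main__":
    x0, p0, mass, omega = 0.3, -1.2, 2.0, 1.5  # assumed sample values
    b0, b0_dag = b_pair(x0, p0, mass, omega)
    for t in (0.0, 0.5, 2.0):
        b, b_dag = b_pair(*trajectory(t, x0, p0, mass, omega), mass, omega)
        # Each difference should vanish if b and b_dagger evolve independently.
        print(t, abs(b - b0 * cmath.exp(1j * omega * t)),
              abs(b_dag - b0_dag * cmath.exp(-1j * omega * t)))
```

Note that the product $bb^\dagger$ does not depend on time, which is just another way of stating that the energy is conserved.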

Transition between pairs x,p and b,b^ is an example of a so-called canonical 
transformation of variables, and the only reason I decided to bother you with it is 
that it paves a way to better understanding its quantum analog, which is of crucial 
importance. According to the quantization rules discussed in Sect. 3.3.2, transition 
from classical to quantum description consists in promoting classical variables to 
quantum operators, and the coordinate-momentum dyad plays a crucial role in the 
process, namely, because it is a canonical pair. The operators replacing classical 
variables are, to a large extent, defined by their commutation relations, and in the 
case of canonical pairs, the commutator is directly linked to the respective Poisson 
brackets, as I have already mentioned previously. In the case of the coordinate- 
momentum pair, the corresponding commutator is obtained from the Poisson 
bracket by multiplying it by ifi. As well as any pair of variables characterized 
by the canonical Poisson bracket, which play the role similar to coordinate and 
momentum in classical mechanics, any quantum mechanical pair of operators with 
canonical commutator ifi will have properties similar to those of the coordinate 
and momentum. For instance, if I know that two Hermitian operators v and <r 
have a commutator [0, &] = ifi , I can without any doubts claim that the operator 
<r in the representation based on eigenvectors of v is & v = —ifid/dv similar 
to the momentum operator in the coordinate representation. I have to emphasize, 
however, the requirement that the operators must be Hermitian. Therefore, if I were 
to promote b and b ^ to operators, it would not work, because they would not be 
Hermitian. Nevertheless, operators similar to b and b ^ (while not exactly like them) 
do play an important role in quantum theory (and not just for harmonic oscillators). 
Now I am ready to get down to our main business and start developing quantum
theory of harmonic oscillators. The goal is to develop the theory as far as possible 
without resorting to any particular representation for momentum and coordinate 
operators. Such an approach will produce the most general results, independent of 
a representation, offer important insights into the quantum properties of oscillators, 
and create a formal framework for extending this theory beyond pure mechanical 
harmonic oscillators. 

I start by "factorizing" the quantum Hamiltonian in a way similar to the factorization
of the classical Hamiltonian in Eq. 7.11. However, to make sure that this factorization
works for operators, I would like to review the origin of the identity

$$u^2 + v^2 = (u + iv)(u - iv).$$

Removing the parentheses on its right-hand side, I have

$$(u + iv)(u - iv) = u^2 + v^2 + ivu - iuv.$$

If $u$ and $v$ are regular variables, the last two terms in this expression cancel, but if
they are non-commuting operators, the original factorization rule is no longer true
and must be corrected:

$$\hat{u}^2 + \hat{v}^2 = (\hat{u} + i\hat{v})(\hat{u} - i\hat{v}) + i[\hat{u}, \hat{v}]. \tag{7.17}$$

The order of the factors on the right-hand side of this expression
can be changed, which results in an alternative form of the identity:

$$\hat{u}^2 + \hat{v}^2 = (\hat{u} - i\hat{v})(\hat{u} + i\hat{v}) - i[\hat{u}, \hat{v}]. \tag{7.18}$$
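
A quick numerical sanity check of identities 7.17 and 7.18 may be reassuring at this point. The sketch below is only an illustration: it assumes NumPy and uses random 4x4 Hermitian matrices as stand-ins for the non-commuting operators $\hat{u}$ and $\hat{v}$ (the helper names `random_hermitian` and `comm` are made up for this example).

```python
import numpy as np

rng = np.random.default_rng(0)

def random_hermitian(n):
    """Random n x n Hermitian matrix, used as a stand-in for an operator."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    return a @ b - b @ a

u = random_hermitian(4)
v = random_hermitian(4)

lhs = u @ u + v @ v
rhs_717 = (u + 1j * v) @ (u - 1j * v) + 1j * comm(u, v)   # Eq. 7.17
rhs_718 = (u - 1j * v) @ (u + 1j * v) - 1j * comm(u, v)   # Eq. 7.18
naive = (u + 1j * v) @ (u - 1j * v)                        # no commutator term

print(np.allclose(lhs, rhs_717), np.allclose(lhs, rhs_718), np.allclose(lhs, naive))
```

Both corrected forms reproduce $\hat{u}^2 + \hat{v}^2$, while the naive product without the commutator term does not for generic non-commuting matrices.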

Identifying $\hat{u}$ and $\hat{v}$ as

$$\hat{u} = \sqrt{\frac{m_e}{2}}\,\omega\hat{x}, \tag{7.19}$$

$$\hat{v} = \frac{\hat{p}}{\sqrt{2m_e}}, \tag{7.20}$$

I find

$$[\hat{u}, \hat{v}] = \frac{1}{2}\omega[\hat{x}, \hat{p}] = \frac{1}{2}i\hbar\omega. \tag{7.21}$$

Operators $\hat{u}$ and $\hat{v}$ have the dimension of $\sqrt{\text{energy}}$, while their commutator,
proportional to $\hbar\omega$, obviously has the dimension of energy. If I am not mistaken, I
have already remarked that it is often quite beneficial to work with dimensionless
quantities. Thus, taking a cue from Eqs. 7.17 and 7.18 and the experience gained
working with the classical Hamiltonian, I will try to generate dimensionless operators
such as
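$$\hat{a} = \frac{1}{\sqrt{\hbar\omega}}\left(\hat{u} + i\hat{v}\right) = \sqrt{\frac{m_e\omega}{2\hbar}}\,\hat{x} + \frac{i\hat{p}}{\sqrt{2m_e\hbar\omega}}, \tag{7.22}$$

$$\hat{a}^{\dagger} = \frac{1}{\sqrt{\hbar\omega}}\left(\hat{u} - i\hat{v}\right) = \sqrt{\frac{m_e\omega}{2\hbar}}\,\hat{x} - \frac{i\hat{p}}{\sqrt{2m_e\hbar\omega}}. \tag{7.23}$$

The commutator of these operators is

$$[\hat{a}, \hat{a}^{\dagger}] = -i\sqrt{\frac{m_e\omega}{2\hbar}}\frac{1}{\sqrt{2m_e\hbar\omega}}[\hat{x}, \hat{p}] + i\sqrt{\frac{m_e\omega}{2\hbar}}\frac{1}{\sqrt{2m_e\hbar\omega}}[\hat{p}, \hat{x}] = \frac{1}{2} + \frac{1}{2} = 1.$$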

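Because of the special importance of this result, I will reproduce it as a separate numbered
formula:

$$[\hat{a}, \hat{a}^{\dagger}] = 1. \tag{7.24}$$

The operators $\hat{a}$ and $\hat{a}^{\dagger}$ are clearly not Hermitian: performing Hermitian conjugation
of Eqs. 7.22 and 7.23, you can immediately see that they are actually Hermitian
conjugates of each other, hence the notation $\hat{a}^{\dagger}$. It will also be useful to express the
coordinate and momentum operators in terms of $\hat{a}$ and $\hat{a}^{\dagger}$. Adding and subtracting
Eqs. 7.22 and 7.23, I can invert these equations to get

$$\hat{x} = \sqrt{\frac{\hbar}{2m_e\omega}}\left(\hat{a} + \hat{a}^{\dagger}\right), \tag{7.25}$$

$$\hat{p} = i\sqrt{\frac{m_e\hbar\omega}{2}}\left(\hat{a}^{\dagger} - \hat{a}\right). \tag{7.26}$$

Using the operator factorization identities 7.17 and 7.18 with $\hat{u}$ and $\hat{v}$ defined in
Eqs. 7.19 and 7.20, I can derive two alternative forms of the Hamiltonian:

$$\hat{H} = \hbar\omega\left(\frac{1}{2} + \hat{a}^{\dagger}\hat{a}\right) = \hbar\omega\left(-\frac{1}{2} + \hat{a}\hat{a}^{\dagger}\right).$$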

These two expressions differ by the order of the operators in them and by the sign in
front of 1/2. Formally they are absolutely equivalent, and one can be reduced to
the other using commutation relation 7.24. However, from a practical point of view
(and you will have to trust me on this for now), the first of these expressions is
much more convenient to use than the other. Thus, in what follows, I will rely on
the representation of the Hamiltonian in the form

$$\hat{H} = \hbar\omega\left(\frac{1}{2} + \hat{a}^{\dagger}\hat{a}\right). \tag{7.27}$$

Our first task is to find the eigenvalues and eigenvectors of this Hamiltonian,
i.e., the stationary states of the harmonic oscillator. Since the classical motion in
the harmonic potential is bound for all values of energy, it should be expected
that the entire spectrum of the Hamiltonian is discrete so that the yet unknown energy
eigenvalues can be labeled by a discrete index as $E_n$ and the respective eigenvectors
as $|E_n\rangle$:

$$\hat{H}|E_n\rangle = E_n|E_n\rangle. \tag{7.28}$$


Since I am not allowed to use any particular representation for the coordinate and
momentum operators, all I have to go on are the commutation relations.
This invites me to use the same purely algebraic technique that I successfully
used previously when searching for eigenvalues of the operators of the angular
momentum in Sect. 3.3.4. However, in the role of the angular momentum ladder
operators $\hat{L}_{\pm}$, I am going to cast operators $\hat{a}$ and $\hat{a}^{\dagger}$, which appear to have some
similarities with $\hat{L}_{\pm}$: they are also non-Hermitian and are Hermitian conjugates
of each other. You might remember that operators $\hat{L}_{\pm}$ applied to an eigenvector
of operator $\hat{L}_z$ generate other eigenvectors with decreased or increased eigenvalue.
Will you be surprised if it turns out that operators $\hat{a}$ and $\hat{a}^{\dagger}$ are doing the same to the
eigenvectors of the harmonic oscillator? Probably not.

The first step is to note that eigenvectors of the Hamiltonian coincide with those
of the operator $\hat{N} = \hat{a}^{\dagger}\hat{a}$, which is called the number operator and is obviously
Hermitian. Indeed, once you rewrite the Hamiltonian as

$$\hat{H} = \hbar\omega\left(\frac{1}{2} + \hat{N}\right),$$

this statement becomes pretty obvious. Moreover, you can immediately see that if
$\lambda_n$ is an eigenvalue of $\hat{N}$, $\hat{N}|E_n\rangle = \lambda_n|E_n\rangle$, then

$$E_n = \hbar\omega\left(\frac{1}{2} + \lambda_n\right). \tag{7.29}$$

Therefore, I can focus my attention on finding the eigenvalues and eigenvectors of
the number operator $\hat{N}$. To this end, I first compute the commutator $[\hat{N}, \hat{a}]$ (if you
want to know what prompted me to do so, the only excuse I can offer is that there
isn't much more for me to do, so why not do that?):

$$[\hat{N}, \hat{a}] = \hat{a}^{\dagger}\hat{a}^2 - \hat{a}\hat{a}^{\dagger}\hat{a} = \left(\hat{a}^{\dagger}\hat{a} - \hat{a}\hat{a}^{\dagger}\right)\hat{a} = -\hat{a}, \tag{7.30}$$
where I took advantage of Eq. 7.24. Carrying out Hermitian conjugation of this
result, and remembering to change the order of the operators in their product after
Hermitian conjugation, I immediately obtain

$$[\hat{N}, \hat{a}^{\dagger}] = \hat{a}^{\dagger}. \tag{7.31}$$

In the next step, I consider $\hat{N}\hat{a}|E_n\rangle$ and use the commutation relation 7.30 to get

$$\hat{N}\hat{a}|E_n\rangle = -\hat{a}|E_n\rangle + \hat{a}\hat{N}|E_n\rangle = \lambda_n\hat{a}|E_n\rangle - \hat{a}|E_n\rangle = (\lambda_n - 1)\,\hat{a}|E_n\rangle.$$

This result shows that $\hat{a}|E_n\rangle$ is an eigenvector of $\hat{N}$ with eigenvalue $\lambda_n - 1$, i.e.,
the operator $\hat{a}$ generates eigenvectors of $\hat{N}$ with eigenvalues decreasing by one with
each application of the operator. Not surprisingly, this operator is called the lowering
operator. The questions that naturally pop up at this point are how far down in
energy one can go and how one knows when the bottom is reached. The answer to
the first question is obvious: energy eigenvalues of the harmonic oscillator
can never be negative, and thus $\lambda_n \geq -1/2$. The second question is answered
by recycling arguments that I have already used when discussing the angular
momentum: the only way to reconcile the ability of $\hat{a}$ to keep decreasing $\lambda_n$ every
time it is applied with the requirement that there must exist a smallest $\lambda_n$ is to impose
on the eigenvector corresponding to this minimum value the condition

$$\hat{a}|E_{min}\rangle = 0. \tag{7.32}$$

Another useful relation is obtained by performing Hermitian conjugation of this
equation:

$$\langle E_{min}|\hat{a}^{\dagger} = 0. \tag{7.33}$$

Now you are going to appreciate the wisdom of writing the Hamiltonian in the form
of Eq. 7.27 and of introducing operator $\hat{N}$. Indeed, Eq. 7.32 used in $\hat{N}|E_{min}\rangle$ gives
$\hat{N}|E_{min}\rangle = 0$, which means that the minimum value is $\lambda_{min} = 0$, and $E_{min} = \hbar\omega/2$.
So, behold the power of the lowering operator: we found the bottom, the lowest
possible energy of a harmonic oscillator, its ground state!

Just like in other examples, the lowest energy is not zero, which is, of course,
a consequence of the uncertainty principle: zero energy would require that both
kinetic and potential energies be equal to zero, which would mean that both
coordinate and momentum have definite values of zero, which is
impossible. The ground state energy $\hbar\omega/2$ is one of the clearest examples of the
energy associated with so-called quantum fluctuations.

The contribution of these fluctuations to the energy can be quantified by
computing the expectation values of $\hat{p}^2$ and $\hat{x}^2$, which determine the quantum
uncertainties of the respective observables. Using Eqs. 7.25 and 7.26 that express
operators $\hat{p}$ and $\hat{x}$ in terms of operators $\hat{a}$ and $\hat{a}^{\dagger}$ in conjunction with Eqs. 7.32
and 7.33, you can immediately see that

$$\langle E_{min}|\hat{x}|E_{min}\rangle = \langle E_{min}|\hat{p}|E_{min}\rangle = 0,$$

so that the uncertainties $\Delta p$ and $\Delta x$ are $\Delta p = \sqrt{\langle p^2\rangle}$ and $\Delta x = \sqrt{\langle x^2\rangle}$. Squaring
Eqs. 7.25 and 7.26, and computing these expectation values with state $|E_{min}\rangle$, you
will get

$$\langle E_{min}|\hat{x}^2|E_{min}\rangle = \frac{\hbar}{2m_e\omega}\left(\langle E_{min}|\hat{a}^2|E_{min}\rangle + \langle E_{min}|(\hat{a}^{\dagger})^2|E_{min}\rangle + \langle E_{min}|\hat{a}^{\dagger}\hat{a}|E_{min}\rangle + \langle E_{min}|\hat{a}\hat{a}^{\dagger}|E_{min}\rangle\right).$$

The first three terms in this expression vanish, thanks to Eqs. 7.32 and 7.33.
However, the last term requires some more effort because the order of operators
$\hat{a}$ and $\hat{a}^{\dagger}$ in it is "wrong" in the sense that it is not conducive to the immediate
application of Eqs. 7.32 and 7.33. The situation, however, can be quite easily
rectified by using the commutation relation 7.24 to change this order and rewrite
this term as

$$\langle E_{min}|\hat{a}\hat{a}^{\dagger}|E_{min}\rangle = \langle E_{min}|1 + \hat{a}^{\dagger}\hat{a}|E_{min}\rangle = 1,$$

where I, as usual, assumed that whatever the state vector $|E_{min}\rangle$ is, it is normalized.
Thus, finally, I find

$$\langle E_{min}|\hat{x}^2|E_{min}\rangle = \frac{\hbar}{2m_e\omega}. \tag{7.34}$$

Similarly,

$$\langle E_{min}|\hat{p}^2|E_{min}\rangle = -\frac{m_e\omega\hbar}{2}\left(\langle E_{min}|\hat{a}^2|E_{min}\rangle + \langle E_{min}|(\hat{a}^{\dagger})^2|E_{min}\rangle - \langle E_{min}|\hat{a}^{\dagger}\hat{a}|E_{min}\rangle - \langle E_{min}|\hat{a}\hat{a}^{\dagger}|E_{min}\rangle\right) = \frac{m_e\omega\hbar}{2}\langle E_{min}|\hat{a}\hat{a}^{\dagger}|E_{min}\rangle = \frac{m_e\omega\hbar}{2}. \tag{7.35}$$


Using Eqs. 7.34 and 7.35 in the expressions for kinetic and potential energies, $\hat{p}^2/2m_e$
and $m_e\omega^2\hat{x}^2/2$, I immediately find that the ground state expectation values $\langle p^2\rangle/2m_e$
and $m_e\omega^2\langle x^2\rangle/2$ are both equal to $\hbar\omega/4$. Isn't it remarkable that while the ground
state of the harmonic oscillator is characterized by a certain value of energy $\hbar\omega/2$, it
is formed by two fluctuating quantities, kinetic and potential energies, contributing
equal amounts? One can actually see here a certain analogy with the classical harmonic
oscillator, whose energy, while being time independent, includes contributions from
kinetic and potential energies, whose time dependencies totally compensate each
other yielding a constant sum.
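
The equal split of the ground state energy between kinetic and potential parts can be checked with a few lines of linear algebra. The sketch below is an illustration rather than part of the derivation: it assumes NumPy, represents $\hat{a}$ by a truncated matrix in the basis of number states (the truncation size `dim` and the numerical values of `hbar`, `m_e`, and `omega` are arbitrary choices made for this example), and evaluates the expectation values in the state annihilated by $\hat{a}$.

```python
import numpy as np

hbar, m_e, omega = 1.0, 0.7, 1.3      # arbitrary illustrative values
dim = 8                               # truncation size (assumption)

# Lowering operator in the number basis: a|n> = sqrt(n)|n-1>.
a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
a_dag = a.conj().T

x = np.sqrt(hbar / (2 * m_e * omega)) * (a + a_dag)        # Eq. 7.25
p = 1j * np.sqrt(m_e * hbar * omega / 2) * (a_dag - a)     # Eq. 7.26

ground = np.zeros(dim)
ground[0] = 1.0                       # the state with a|ground> = 0

def expect(op, state):
    """Expectation value <state|op|state> (real part)."""
    return np.vdot(state, op @ state).real

kinetic = expect(p @ p, ground) / (2 * m_e)
potential = 0.5 * m_e * omega**2 * expect(x @ x, ground)

print(kinetic, potential, hbar * omega / 4)
```

The truncation does not affect these particular matrix elements, because $\hat{x}^2$ and $\hat{p}^2$ acting on the ground state only reach the two lowest excited states.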

OK, by finding the energy of the ground state, I took you down to the very bottom
of the energy valley. Now it is time to climb back up, and we are going to do it with
the assistance of ... wait for it ..., of course, the operator $\hat{a}^{\dagger}$! Actually, there is not
much surprise or suspense here because this is exactly what happened with angular
momentum operators: we used $\hat{L}_{-}$ to find the lowest eigenvalue and operator $\hat{L}_{+}$ to
move up from there. My next step is pretty obvious now: consider $\hat{N}\hat{a}^{\dagger}|E_n\rangle$,

$$\hat{N}\hat{a}^{\dagger}|E_n\rangle = \hat{a}^{\dagger}|E_n\rangle + \hat{a}^{\dagger}\hat{N}|E_n\rangle = \lambda_n\hat{a}^{\dagger}|E_n\rangle + \hat{a}^{\dagger}|E_n\rangle = (\lambda_n + 1)\,\hat{a}^{\dagger}|E_n\rangle,$$

where this time I used the commutation relation from Eq. 7.31. So, as expected, $\hat{a}^{\dagger}|E_n\rangle$
is an eigenvector of the number operator with eigenvalue $\lambda_n + 1$, i.e., operator $\hat{a}^{\dagger}$
does generate eigenvectors with eigenvalues increasing by one for each application
of the operator. Starting with the ground state, for which $\lambda_{min} = 0$, operator $\hat{a}^{\dagger}$ will
generate eigenvectors with eigenvalues of $\hat{N}$ equal to $1, 2, 3, \ldots$. In other words, the
eigenvalues of the number operator are all natural numbers $n$ starting with 0, which
makes the energy levels of the quantum harmonic oscillator, according to Eq. 7.29, equal to

$$E_n = \hbar\omega\left(n + \frac{1}{2}\right), \quad n = 0, 1, 2, \ldots \tag{7.36}$$
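
The algebraic result 7.36 can be cross-checked by brute force. The sketch below is an independent numerical illustration, not the method used in the text: it assumes NumPy, works in units $\hbar = m_e = \omega = 1$, and does exactly what the derivation above avoids, namely picks the coordinate representation and diagonalizes a finite-difference version of the Hamiltonian on a grid (grid size and extent are arbitrary choices).

```python
import numpy as np

# Units with hbar = m_e = omega = 1, so Eq. 7.36 predicts E_n = n + 1/2.
n_grid = 1201
x = np.linspace(-12.0, 12.0, n_grid)
dx = x[1] - x[0]

# Kinetic energy -(1/2) d^2/dx^2 via the three-point finite-difference stencil.
main = np.full(n_grid, 1.0 / dx**2)
off = np.full(n_grid - 1, -0.5 / dx**2)
hamiltonian = np.diag(main + 0.5 * x**2) + np.diag(off, 1) + np.diag(off, -1)

# Lowest six eigenvalues of the discretized Hamiltonian.
energies = np.linalg.eigvalsh(hamiltonian)[:6]
print(energies)
```

Up to a small discretization error, the lowest eigenvalues come out equally spaced by one unit of $\hbar\omega$ and start at one half, as the ladder-operator argument predicts.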


What is left for us now is to find the corresponding eigenvectors, for which, from
now on, I will use the simplified notation $|n\rangle$. All I know at this point is that
if $|n\rangle$ is an eigenvector corresponding to the eigenvalue $n$ of the number operator,
then $\hat{a}^{\dagger}|n\rangle$ is an eigenvector corresponding to the eigenvalue $n + 1$. But I cannot
guarantee that this new eigenvector will be normalized even if $|n\rangle$ is. Therefore,
reserving the bra and ket notation only for normalized vectors, the best I can write
for now is

$$\hat{a}^{\dagger}|n\rangle = c_n|n + 1\rangle, \tag{7.37}$$

where $|n + 1\rangle$ is assumed normalized and $c_n$ is an as yet unknown normalization factor.
Again, I cannot help but remind you that we encountered exactly the same situation
when discussing eigenvectors of $\hat{L}^2$. To find $c_n$, I first write down a Hermitian
conjugated version of Eq. 7.37:

$$\langle n|\hat{a} = c_n^{*}\langle n + 1|. \tag{7.38}$$

Then, multiplying left-hand and right-hand sides of Eqs. 7.37 and 7.38, I get

$$\langle n|\hat{a}\hat{a}^{\dagger}|n\rangle = |c_n|^2\langle n + 1|n + 1\rangle.$$

Using commutation relation 7.24 and taking into account that all vectors are now
assumed normalized, I have

$$\langle n|\hat{N} + 1|n\rangle = |c_n|^2 \;\Rightarrow\; |c_n|^2 = n + 1.$$

Taking advantage of the freedom in the choice of the phase of the normalization
factor, I choose $c_n$ to be real positive. Now I have the rule for generating new
normalized eigenvectors:

$$|n + 1\rangle = \frac{1}{\sqrt{n + 1}}\,\hat{a}^{\dagger}|n\rangle.$$

Applying this rule sequentially starting with the ground state, I end up with the
following expression for an arbitrary eigenvector $|n\rangle$:

$$|n\rangle = \frac{1}{\sqrt{n!}}\left(\hat{a}^{\dagger}\right)^n|0\rangle, \tag{7.39}$$

where $|0\rangle$ stands for the eigenvector corresponding to the ground state. One can also
show that

$$\hat{a}|n\rangle = \sqrt{n}\,|n - 1\rangle, \tag{7.40}$$

but I will leave a proof of this relation as an exercise.

Equation 7.39 relates eigenvectors describing excited stationary states of the
oscillator to its ground state. The latter, however, might appear to you to be
undetermined, which is true if by "determining" it you mean expressing it in terms
of some known vectors or functions. However, for most purposes, all the information
that you need about the ground state is contained in Eq. 7.32, and in this sense,
this equation is the definition of the ground state. You can use it to find answers
to any specific question pertaining to this state. For instance, if you are interested
in a function representing this state in the coordinate representation, you can use
the coordinate representation of the momentum and coordinate operators to turn
Eq. 7.32 into an easy-to-solve differential equation for $\varphi_0(x) = \langle x|E_{min}\rangle$:

$$\left(\sqrt{\frac{m_e\omega}{2\hbar}}\,x + \sqrt{\frac{\hbar}{2m_e\omega}}\frac{d}{dx}\right)\varphi_0(x) = 0 \;\Rightarrow\; \frac{d\varphi_0(x)}{dx} = -\frac{m_e\omega}{\hbar}x\varphi_0(x) \;\Rightarrow\; \varphi_0 = C\exp\left(-\frac{x^2}{2\xi^2}\right). \tag{7.41}$$
Parameter $\xi$ appearing in this equation is defined as

$$\xi = \sqrt{\frac{\hbar}{m_e\omega}} \tag{7.42}$$

and has the dimension of length. It specifies the characteristic scale of the spatial
dependence of the wave function: for $|x| \ll \xi$ the wave function is almost constant,
while for $|x| \gg \xi$ its behavior crosses over to a steep descent. It is easy to see that
this parameter characterizes a transition between classically allowed and classically
forbidden regions of coordinates for the harmonic oscillator. Indeed, the substitution
of the quantum ground state energy $E = \hbar\omega/2$ into Eq. 7.10 for the amplitude $A$ of
the classical oscillator yields $A = \xi$, which means that for the ground state of the
oscillator $|x| < \xi$ corresponds to the classically allowed region, and the region $|x| > \xi$
is classically forbidden.

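Integration constant $C$ in Eq. 7.41 is found from the normalization condition:

$$\int_{-\infty}^{\infty}|\varphi_0(x)|^2\,dx = C^2\xi\int_{-\infty}^{\infty}\exp\left(-\tilde{x}^2\right)d\tilde{x} = C^2\sqrt{\pi}\,\xi = 1,$$

where I computed the integral by introducing a dimensionless variable $\tilde{x} = x/\xi$ and
using the known value of the Gaussian integral $\int_{-\infty}^{\infty}\exp\left(-y^2\right)dy = \sqrt{\pi}$. Thus, the
normalized version of the oscillator ground state wave function becomes

$$\varphi_0 = \frac{1}{\sqrt{\sqrt{\pi}\,\xi}}\exp\left(-\frac{x^2}{2\xi^2}\right). \tag{7.43}$$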


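Having found the normalized ground state wave function in the coordinate
representation, I can now use the raising operator (also rewritten in the coordinate
representation) to generate wave functions representing an arbitrary stationary state
of the Hamiltonian:

$$\varphi_n(x) = \frac{1}{\sqrt{2^n n!\sqrt{\pi}\,\xi}}\left(\tilde{x} - \frac{d}{d\tilde{x}}\right)^n\exp\left(-\frac{\tilde{x}^2}{2}\right).$$

Here I used the coordinate representation for the raising operator expressed in terms
of the dimensionless variable $\tilde{x} = x/\xi$:

$$\hat{a}^{\dagger} = \frac{x}{\sqrt{2}\,\xi} - \frac{\hbar}{\sqrt{2m_e\hbar\omega}}\frac{d}{dx} = \frac{1}{\sqrt{2}}\left(\tilde{x} - \frac{d}{d\tilde{x}}\right),$$

substituted in Eq. 7.39. You can easily convince yourselves that the expression

$$\left(\tilde{x} - \frac{d}{d\tilde{x}}\right)^n\exp\left(-\frac{\tilde{x}^2}{2}\right)$$

generates polynomials multiplied by the exponential function $\exp\left(-\tilde{x}^2/2\right)$. Pulling
out this exponential factor, you end up with the so-called Hermite polynomials $H_n(\tilde{x})$
defined as

$$H_n(\tilde{x}) = \exp\left(\frac{\tilde{x}^2}{2}\right)\left(\tilde{x} - \frac{d}{d\tilde{x}}\right)^n\exp\left(-\frac{\tilde{x}^2}{2}\right),$$

so that the oscillator's wave function takes the form

$$\varphi_n(x) = \frac{1}{\sqrt{2^n n!\sqrt{\pi}\,\xi}}\exp\left(-\frac{x^2}{2\xi^2}\right)H_n\left(\frac{x}{\xi}\right). \tag{7.44}$$

Hermite polynomials are well known in mathematical physics and can be
computed from the following somewhat simpler expression:

$$H_n(x) = (-1)^n e^{x^2}\frac{d^n}{dx^n}e^{-x^2}. \tag{7.45}$$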

The properties of these polynomials are well documented (google it!), so I will only
emphasize one point: these polynomials and, therefore, the entire wave function
have a definite parity: it is even for $n = 0, 2, 4, \ldots$ and odd for $n = 1, 3, 5, \ldots$.
Obviously this fact is the result of the symmetry of the harmonic oscillator potential
with respect to inversion and is in agreement with our previous discussions of the
connection between this symmetry and the parity of the quantum states. Figure 7.2
presents graphs of wave functions representing states with $n = 0, 1, 2, 3$, from
which you can see that another general rule is also fulfilled here: the number of
zeroes of the wave function coincides with the number of the respective energy level
$n$. Note that $n$ is counted here starting from zero; therefore, the number of zeroes of
the wave function is $n$ instead of $n - 1$.
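
These properties are easy to verify numerically. The sketch below is an illustration under stated assumptions: it uses NumPy's `numpy.polynomial.hermite.hermval` for the physicists' Hermite polynomials, sets $\xi = 1$, and samples the wave functions of Eq. 7.44 on a uniform grid (the helper name `phi` and the grid parameters are choices made for this example). It checks orthonormality, parity, and the number of zeroes for $n = 0, \ldots, 3$.

```python
import numpy as np
from math import factorial, pi, sqrt
from numpy.polynomial.hermite import hermval

def phi(n, x):
    """Wave function of Eq. 7.44 with xi = 1 (so x is already dimensionless)."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0                          # selects the single polynomial H_n
    norm = 1.0 / sqrt(2**n * factorial(n) * sqrt(pi))
    return norm * np.exp(-x**2 / 2) * hermval(x, coeffs)

x = np.linspace(-10.0, 10.0, 4001)           # symmetric grid
dx = x[1] - x[0]
states = np.array([phi(n, x) for n in range(4)])

# Overlap matrix <m|n> approximated by a Riemann sum.
overlaps = states @ states.T * dx

# Parity: phi_n(-x) = (-1)^n phi_n(x); reversing the grid maps x -> -x.
parity_ok = [np.allclose(s[::-1], (-1)**n * s) for n, s in enumerate(states)]

# Count sign changes, ignoring points where the function is numerically zero.
sign_changes = [
    int(np.sum(np.diff(np.sign(s[np.abs(s) > 1e-12])) != 0)) for s in states
]

print(np.round(overlaps, 6))
print(parity_ok, sign_changes)
```

The overlap matrix comes out as the identity, the parity alternates with $n$, and the number of sign changes equals $n$.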

The coordinate representation is, obviously, not the only possible way to present
eigenvectors of the harmonic oscillator. As a second example, I want to discuss
a representation based on eigenvectors of the Hamiltonian $|n\rangle$. The eigenvectors
themselves in this representation are presented, as all basis vectors, by columns
with a single entry, equal to unity, in the row corresponding to the number of the
respective basis vector. The Hamiltonian in this basis is presented by a diagonal
matrix $H_{nm} = E_n\delta_{nm}$, where $E_n$ are energy eigenvalues given by Eq. 7.36. Less
trivial is the representation of coordinate and momentum operators, and to find it I
must first compute the matrix elements of the lowering (or raising, it does not really
matter) operator, $a_{mn} = \langle m|\hat{a}|n\rangle$:

$$a_{mn} = \langle m|\hat{a}|n\rangle = \sqrt{n}\,\langle m|n - 1\rangle = \sqrt{n}\,\delta_{m,n-1},$$
Fig. 7.2 Wave functions representing states of harmonic oscillators with n = 0 (upper left graph),
n = 1 (upper right graph), n = 2 (lower left graph), and n = 3 (lower right graph)


where I used Eq. 7.40 and the fact that all eigenvectors $|n\rangle$ are orthonormal. To
visualize this matrix correctly, it is important to remember that index $n$ in Eq. 7.40
starts counting from zero, and it is convenient to keep it this way when enumerating
matrix elements. In this case the first row is given by $a_{0n}$, the second by $a_{1n}$, and so
on. Respectively, the first column is given by $a_{m0}$. Non-zero elements in matrix $a_{mn}$
are characterized by a column index exceeding the respective row index by one, i.e.,
$a_{0,1}$, $a_{1,2}$, etc.; they run parallel to the main diagonal but one element above it:

$$a_{mn} = \begin{bmatrix} 0 & \sqrt{1} & 0 & 0 & \cdots \\ 0 & 0 & \sqrt{2} & 0 & \cdots \\ 0 & 0 & 0 & \sqrt{3} & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix}. \tag{7.46}$$


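The matrix for the raising operator

$$a^{\dagger}_{mn} = \langle m|\hat{a}^{\dagger}|n\rangle = \sqrt{n + 1}\,\langle m|n + 1\rangle = \sqrt{n + 1}\,\delta_{m,n+1}$$

is obtained from Eq. 7.46 by simple transposition:

$$a^{\dagger}_{mn} = \begin{bmatrix} 0 & 0 & 0 & 0 & \cdots \\ \sqrt{1} & 0 & 0 & 0 & \cdots \\ 0 & \sqrt{2} & 0 & 0 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix}. \tag{7.47}$$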


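Obtaining matrices for coordinate and momentum operators is now as easy as
adding two matrices. Using Eqs. 7.25 and 7.26, I find

$$x_{mn} = \sqrt{\frac{\hbar}{2m_e\omega}}\begin{bmatrix} 0 & \sqrt{1} & 0 & 0 & \cdots \\ \sqrt{1} & 0 & \sqrt{2} & 0 & \cdots \\ 0 & \sqrt{2} & 0 & \sqrt{3} & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix}, \tag{7.48}$$

$$p_{mn} = i\sqrt{\frac{m_e\hbar\omega}{2}}\begin{bmatrix} 0 & -\sqrt{1} & 0 & 0 & \cdots \\ \sqrt{1} & 0 & -\sqrt{2} & 0 & \cdots \\ 0 & \sqrt{2} & 0 & -\sqrt{3} & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix}. \tag{7.49}$$

Both matrices are obviously Hermitian, but for the matrix representing the momentum
operator, one must remember to do complex conjugation in addition to matrix
transposition.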
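
The matrices 7.46-7.49 are convenient to play with on a computer. The sketch below is an illustration, assuming NumPy, units $\hbar = m_e = \omega = 1$, and a truncation of the infinite matrices to `dim` rows and columns (all of these are choices made for this example, not part of the text). It builds the matrices and checks Hermiticity, the canonical commutator, and the diagonal form of the Hamiltonian.

```python
import numpy as np

hbar, m_e, omega = 1.0, 1.0, 1.0   # illustrative units
dim = 6                            # truncation size (assumption)

a = np.diag(np.sqrt(np.arange(1, dim)), k=1)   # Eq. 7.46: sqrt(n) above the diagonal
a_dag = a.T                                    # Eq. 7.47: real matrix, so transpose suffices

x = np.sqrt(hbar / (2 * m_e * omega)) * (a + a_dag)        # Eq. 7.48
p = 1j * np.sqrt(m_e * hbar * omega / 2) * (a_dag - a)     # Eq. 7.49

commutator = x @ p - p @ x
hamiltonian = p @ p / (2 * m_e) + 0.5 * m_e * omega**2 * (x @ x)

print(np.round(commutator.imag, 6))
print(np.round(np.diag(hamiltonian).real, 6))
```

One caveat of any finite truncation shows up in the last row and column: the commutator equals $i\hbar$ times the unit matrix and the diagonal of the Hamiltonian equals $\hbar\omega(n + 1/2)$ everywhere except in the very last diagonal element, which is an artifact of cutting off the infinite matrices.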


7.1.2 Dynamics of Quantum Harmonic Oscillator 

When talking (or thinking) about a harmonic oscillator, we are intuitively looking
for a quantity that changes periodically with time, that is, oscillates. However, the
stationary states, which I presented to you in the preceding section, are not very
helpful in satisfying our intuitive subconscious desire to see a pendulum or at least
something oscillating. Stationary states, even though they have non-zero energies
associated with them, do not describe any dynamics or any physically relevant
time dependence. Any expectation values computed with stationary states are
time-independent, and those for coordinate and momentum are zero, not only for
the ground state but for any stationary state. This is, of course, obvious from the
coordinate and momentum matrices presented in Eqs. 7.48 and 7.49, but one can
also make a symmetry-based argument explaining this result. Even though this is
a detour from the main goal of this section, I will take it because symmetry-based
arguments are important in many areas of quantum mechanics and also because they
are cool.

The Hamiltonian of the harmonic oscillator is invariant with respect to the
inversion operator $\hat{\Pi}$ (see Sect. 6.2.1), and, therefore, its eigenvectors have a definite
parity, as was already mentioned. Any expectation value involves a bra and ket
pair of them, and, therefore, whether they are odd or even, their overall contribution
is invariant with respect to $\hat{\Pi}$ (a product of two odd functions is even). At the
same time, coordinate and momentum operators are odd with respect to the parity
transformation: $\hat{\Pi}^{-1}\hat{x}\hat{\Pi} = -\hat{x}$, $\hat{\Pi}^{-1}\hat{p}\hat{\Pi} = -\hat{p}$, as was shown in Sect. 5.1.2. Thus,
on one hand, expectation values are supposed to change sign upon inversion, but
on the other hand, they must not because they represent a property of the system
invariant with respect to inversion. Thus, ponder this: you have a situation when
a quantity must simultaneously change its sign while remaining the same. Clearly,
there is only one quantity capable of this Houdini trick, and it is the great invention
of Hindu mathematicians, the zero.

Now, back to the main topic. It is clear that the only way a quantum harmonic
oscillator can actually oscillate is by being in a nonstationary state. We have
discussed two approaches to dealing with nonstationary phenomena: the Schrödinger
picture (operators are time-independent, state vectors are time-dependent) and the
Heisenberg picture (operators depend on time, and state vectors do not). I will
treat the dynamics of the harmonic oscillator using both pictures beginning with
the Heisenberg approach.

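Heisenberg equations 4.24 can be derived for any operator, and in Sect. 4.2 I
did that for the momentum and coordinate operators. Equation 4.33 provides you
with the complete solution of the respective Heisenberg equations and essentially
with all you might need to describe the dynamics of any experimentally
relevant quantity. However, I would like to revisit the problem of finding
time-dependent position and momentum operators, but this time I will do it by solving
the Heisenberg equations for the lowering and raising operators. The corresponding
equations are

$$\frac{d\hat{a}_H}{dt} = -\frac{i}{\hbar}\left[\hat{a}_H, \hat{H}\right], \qquad \frac{d\hat{a}^{\dagger}_H}{dt} = -\frac{i}{\hbar}\left[\hat{a}^{\dagger}_H, \hat{H}\right].$$

The expression for the Hamiltonian in terms of Heisenberg operators $\hat{a}_H$, $\hat{a}^{\dagger}_H$ is the
same as in terms of Schrödinger operators:

$$e^{i\hat{H}t/\hbar}\hat{H}e^{-i\hat{H}t/\hbar} = \hat{H} = \hbar\omega\left(e^{i\hat{H}t/\hbar}\hat{a}^{\dagger}e^{-i\hat{H}t/\hbar}e^{i\hat{H}t/\hbar}\hat{a}e^{-i\hat{H}t/\hbar} + \frac{1}{2}\right) = \hbar\omega\left(\hat{a}^{\dagger}_H\hat{a}_H + \frac{1}{2}\right)$$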
and, therefore, all the commutation relations, which we calculated for Schrödinger
operators, remain the same. In particular, using Eqs. 7.30 and 7.31, I can find

$$\left[\hat{a}_H, \hat{H}\right] = \hbar\omega\left[\hat{a}_H, \hat{N}\right] = \hbar\omega\hat{a}_H, \quad \text{and} \quad \left[\hat{a}^{\dagger}_H, \hat{H}\right] = \hbar\omega\left[\hat{a}^{\dagger}_H, \hat{N}\right] = -\hbar\omega\hat{a}^{\dagger}_H.$$

Substituting these in the Heisenberg equations, I obtain the following nice-looking
equations:

$$\frac{d\hat{a}_H}{dt} = -i\omega\hat{a}_H, \tag{7.50}$$

$$\frac{d\hat{a}^{\dagger}_H}{dt} = i\omega\hat{a}^{\dagger}_H. \tag{7.51}$$

Unlike equations for the momentum and coordinate operators, Eqs. 7.50 and 7.51
are not coupled, so they can be solved independently; in that they have a striking
resemblance to classical Eqs. 7.14 and 7.13. Solutions to these equations are easy to
write:

$$\hat{a}_H = \hat{a}e^{-i\omega t}, \qquad \hat{a}^{\dagger}_H = \hat{a}^{\dagger}e^{i\omega t}, \tag{7.52}$$


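where $\hat{a}$ and $\hat{a}^{\dagger}$ are Schrödinger operators that play the role of initial conditions
for the Heisenberg equations. Equations 7.25 and 7.26 are obviously valid for
Heisenberg operators as well, so that one can obtain for the time-dependent coordinate
and momentum operators:

$$\hat{x}_H = \sqrt{\frac{\hbar}{2m_e\omega}}\left(\hat{a}e^{-i\omega t} + \hat{a}^{\dagger}e^{i\omega t}\right), \tag{7.53}$$

$$\hat{p}_H = i\sqrt{\frac{m_e\hbar\omega}{2}}\left(\hat{a}^{\dagger}e^{i\omega t} - \hat{a}e^{-i\omega t}\right). \tag{7.54}$$

Using the Euler relation for the exponential functions, I can rewrite this result in the
form previously derived in Eq. 4.33:

$$\hat{x}_H = \sqrt{\frac{\hbar}{2m_e\omega}}\left[\left(\hat{a} + \hat{a}^{\dagger}\right)\cos\omega t + i\left(\hat{a}^{\dagger} - \hat{a}\right)\sin\omega t\right], \tag{7.55}$$

$$\hat{p}_H = i\sqrt{\frac{m_e\hbar\omega}{2}}\left[\left(\hat{a}^{\dagger} - \hat{a}\right)\cos\omega t + i\left(\hat{a} + \hat{a}^{\dagger}\right)\sin\omega t\right], \tag{7.56}$$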

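which agrees with Eq. 4.33 after one recognizes that at $t = 0$ these equations
reproduce coordinate and momentum operators in the Schrödinger representation.
Any of Eqs. 7.53-7.56 can be used, for instance, to compute the expectation values
of coordinate and momentum for an arbitrary initial state $|\alpha_0\rangle$. This task can be
facilitated by using the basis of the eigenvectors $|n\rangle$ to represent this state:

$$|\alpha_0\rangle = \sum_{n=0}^{\infty}c_n|n\rangle. \tag{7.57}$$

It is a bit more convenient to carry out these calculations using the exponential form
of time dependence as in Eqs. 7.53 and 7.54:

$$\begin{aligned}
\langle x\rangle &= \sqrt{\frac{\hbar}{2m_e\omega}}\left[e^{-i\omega t}\sum_{n,m=0}^{\infty}c_m^{*}c_n\langle m|\hat{a}|n\rangle + e^{i\omega t}\sum_{n,m=0}^{\infty}c_m^{*}c_n\langle m|\hat{a}^{\dagger}|n\rangle\right] \\
&= \sqrt{\frac{\hbar}{2m_e\omega}}\left[e^{-i\omega t}\sum_{n,m=0}^{\infty}c_m^{*}c_n\sqrt{n}\,\delta_{m,n-1} + e^{i\omega t}\sum_{n,m=0}^{\infty}c_m^{*}c_n\sqrt{n + 1}\,\delta_{m,n+1}\right] \\
&= \sqrt{\frac{\hbar}{2m_e\omega}}\left[e^{-i\omega t}\sum_{m=0}^{\infty}c_m^{*}c_{m+1}\sqrt{m + 1} + e^{i\omega t}\sum_{m=0}^{\infty}c_{m+1}^{*}c_m\sqrt{m + 1}\right],
\end{aligned} \tag{7.58}$$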


where I used the previously derived matrix elements for the lowering and raising
operators. Now, we are getting something familiar: the expectation value of the coordinate
does indeed oscillate with the frequency of the harmonic oscillator $\omega$, and, what is
interesting, this behavior does not depend on the actual initial state, as long as it has
contributions from at least two adjacent stationary states so that both $c_m$ and $c_{m+1}$ are
different from zero. This requirement, of course, excludes initial stationary states,
which would have only one nonvanishing coefficient $c_m$, as well as nonstationary
states with a definite parity, which would contain only coefficients $c_m$ with either
odd or even $m$. With a bit of imagination, you can recognize in Eq. 7.58 the behavior
typical for a classical harmonic oscillator, which can be described as

$$\langle x\rangle = A\cos\left(\omega t + \phi\right), \tag{7.59}$$

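where the amplitude and phase of the oscillations are determined by the initial
conditions (as they also are in the classical case):

$$A = \sqrt{\frac{2\hbar}{m_e\omega}}\left|\sum_{m=0}^{\infty}c_m^{*}c_{m+1}\sqrt{m + 1}\right|; \qquad \phi = \arctan\frac{\mathrm{Im}\left[\sum_{m=0}^{\infty}c_{m+1}^{*}c_m\sqrt{m + 1}\right]}{\mathrm{Re}\left[\sum_{m=0}^{\infty}c_{m+1}^{*}c_m\sqrt{m + 1}\right]}. \tag{7.60}$$

Similar calculations for the momentum operator produce

$$\langle p\rangle = i\sqrt{\frac{m_e\hbar\omega}{2}}\left[e^{i\omega t}\sum_{m=0}^{\infty}c_{m+1}^{*}c_m\sqrt{m + 1} - e^{-i\omega t}\sum_{m=0}^{\infty}c_m^{*}c_{m+1}\sqrt{m + 1}\right], \tag{7.61}$$

which can be rewritten using the same amplitude and phase as

$$\langle p\rangle = -m_e\omega A\sin\left(\omega t + \phi\right) \tag{7.62}$$

in full agreement with the Ehrenfest theorem, Eq. 4.17.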
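
The classical-looking behavior of the averages can be confirmed numerically for an arbitrary superposition. The sketch below is an illustration under stated assumptions: it uses NumPy, truncated number-basis matrices, and randomly chosen coefficients $c_n$ (the truncation size, the random seed, and the parameter values are arbitrary). It evolves the state in the Schrödinger picture and compares the averages with $A\cos(\omega t + \phi)$ and $-m_e\omega A\sin(\omega t + \phi)$, taking $A = \sqrt{2\hbar/m_e\omega}\,|S|$ and $\phi = -\arg S$ with $S = \sum_m c_m^{*}c_{m+1}\sqrt{m+1}$, which is what Eq. 7.58 implies.

```python
import numpy as np

hbar, m_e, omega = 1.0, 1.0, 2.0          # illustrative values
dim = 8                                   # number of basis states kept (assumption)
rng = np.random.default_rng(1)
c = rng.normal(size=dim) + 1j * rng.normal(size=dim)
c /= np.linalg.norm(c)                    # normalized initial coefficients c_n

a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
x = np.sqrt(hbar / (2 * m_e * omega)) * (a + a.T)
p = 1j * np.sqrt(m_e * hbar * omega / 2) * (a.T - a)
energies = hbar * omega * (np.arange(dim) + 0.5)

# S = sum_m c_m^* c_{m+1} sqrt(m+1) fixes the amplitude and the phase.
s = np.sum(np.conj(c[:-1]) * c[1:] * np.sqrt(np.arange(1, dim)))
amplitude = np.sqrt(2 * hbar / (m_e * omega)) * abs(s)
phase = -np.angle(s)

def averages(t):
    """<x> and <p> at time t from Schrodinger-picture evolution of the state."""
    ct = c * np.exp(-1j * energies * t / hbar)
    return np.vdot(ct, x @ ct).real, np.vdot(ct, p @ ct).real

times = np.linspace(0.0, 5.0, 11)
x_direct, p_direct = np.array([averages(t) for t in times]).T
x_formula = amplitude * np.cos(omega * times + phase)
p_formula = -m_e * omega * amplitude * np.sin(omega * times + phase)

print(np.max(np.abs(x_direct - x_formula)), np.max(np.abs(p_direct - p_formula)))
```

Both differences are at the level of rounding errors, regardless of how the coefficients are chosen.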

Before shifting attention to the Schrödinger picture, let me consider a few more
examples of the application of the Heisenberg equations.

Example 18 (Uncertainties of Coordinate and Momentum of a Harmonic Oscillator)
Assume that the harmonic oscillator is initially in a state described by an equal
superposition of its ground and the first excited states:

$$|\alpha_0\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right).$$

Compute uncertainties of the coordinate and momentum operators at an arbitrary
time $t$ and demonstrate, using the Heisenberg picture, that the uncertainty relation is
fulfilled at all times.

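Using Eqs. 7.59, 7.62, and 7.60 with $c_0 = c_1 = 1/\sqrt{2}$ and $c_m = 0$ for $m > 1$, the only
nonvanishing term in the sums is $c_0^{*}c_1 = 1/2$, so that $A = \sqrt{\hbar/2m_e\omega}$ and $\phi = 0$, and I
find for the expectation values

$$\langle x\rangle = \sqrt{\frac{\hbar}{2m_e\omega}}\cos\omega t,$$

$$\langle p\rangle = -\sqrt{\frac{\hbar\omega m_e}{2}}\sin\omega t.$$

To find the uncertainties, I first have to compute $\langle p^2\rangle$ and $\langle x^2\rangle$. I begin by computing

$$\hat{x}_H^2 = \frac{\hbar}{2m_e\omega}\left(\hat{a}e^{-i\omega t} + \hat{a}^{\dagger}e^{i\omega t}\right)^2 = \frac{\hbar}{2m_e\omega}\left[\hat{a}^2e^{-2i\omega t} + \hat{a}\hat{a}^{\dagger} + \hat{a}^{\dagger}\hat{a} + \left(\hat{a}^{\dagger}\right)^2e^{2i\omega t}\right],$$

$$\hat{p}_H^2 = -\frac{m_e\hbar\omega}{2}\left(\hat{a}e^{-i\omega t} - \hat{a}^{\dagger}e^{i\omega t}\right)^2 = -\frac{m_e\hbar\omega}{2}\left[\hat{a}^2e^{-2i\omega t} - \hat{a}\hat{a}^{\dagger} - \hat{a}^{\dagger}\hat{a} + \left(\hat{a}^{\dagger}\right)^2e^{2i\omega t}\right].$$

Now, remembering that $\hat{a}|0\rangle = 0$, $\hat{a}|1\rangle = |0\rangle$, $\hat{a}|2\rangle = \sqrt{2}|1\rangle$, $\hat{a}^{\dagger}|0\rangle = |1\rangle$,
$\hat{a}^{\dagger}|1\rangle = \sqrt{2}|2\rangle$, $\hat{a}^{\dagger}|2\rangle = \sqrt{3}|3\rangle$, I get

$$\hat{x}_H^2|\alpha_0\rangle = \frac{\hbar}{2\sqrt{2}\,m_e\omega}\left(|0\rangle + \sqrt{2}e^{2i\omega t}|2\rangle + 2|1\rangle + |1\rangle + \sqrt{6}e^{2i\omega t}|3\rangle\right),$$

$$\hat{p}_H^2|\alpha_0\rangle = -\frac{m_e\hbar\omega}{2\sqrt{2}}\left(-|0\rangle + \sqrt{2}e^{2i\omega t}|2\rangle - 2|1\rangle - |1\rangle + \sqrt{6}e^{2i\omega t}|3\rangle\right),$$

and, finally,

$$\langle\alpha_0|\hat{x}_H^2|\alpha_0\rangle = \frac{\hbar}{4m_e\omega}\left(1 + 3\right) = \frac{\hbar}{m_e\omega},$$

$$\langle\alpha_0|\hat{p}_H^2|\alpha_0\rangle = \frac{m_e\hbar\omega}{4}\left(1 + 3\right) = m_e\hbar\omega.$$

Now I can find the uncertainties:

$$\Delta x = \sqrt{\langle x^2\rangle - \langle x\rangle^2} = \sqrt{\frac{\hbar}{m_e\omega}\left(1 - \frac{1}{2}\cos^2\omega t\right)},$$

$$\Delta p = \sqrt{\langle p^2\rangle - \langle p\rangle^2} = \sqrt{m_e\hbar\omega\left(1 - \frac{1}{2}\sin^2\omega t\right)},$$

$$\Delta x\,\Delta p = \frac{\hbar}{\sqrt{2}}\sqrt{1 + \frac{1}{8}\sin^2 2\omega t} \geq \frac{\hbar}{\sqrt{2}} \approx 0.71\hbar > \frac{\hbar}{2}$$

in agreement with the uncertainty principle.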
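
The uncertainty product for this state can also be evaluated numerically as an independent cross-check. The sketch below is an illustration, assuming NumPy, units $\hbar = m_e = \omega = 1$, and truncated number-basis matrices (the helper name `spread` and the time grid are choices made for this example). It evolves the state $(|0\rangle + |1\rangle)/\sqrt{2}$ over one period and computes $\Delta x\,\Delta p$ directly from the definition of the uncertainties.

```python
import numpy as np

hbar, m_e, omega = 1.0, 1.0, 1.0          # illustrative units
dim = 8                                   # large enough for x^2 and p^2 on this state
a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
x = np.sqrt(hbar / (2 * m_e * omega)) * (a + a.T)
p = 1j * np.sqrt(m_e * hbar * omega / 2) * (a.T - a)
energies = hbar * omega * (np.arange(dim) + 0.5)

alpha0 = np.zeros(dim, dtype=complex)
alpha0[:2] = 1 / np.sqrt(2)               # (|0> + |1>) / sqrt(2)

def spread(op, state):
    """Uncertainty sqrt(<op^2> - <op>^2) in the given normalized state."""
    mean = np.vdot(state, op @ state).real
    mean_sq = np.vdot(state, op @ (op @ state)).real
    return np.sqrt(mean_sq - mean**2)

times = np.linspace(0.0, 2 * np.pi / omega, 201)
products = np.array([
    spread(x, alpha0 * np.exp(-1j * energies * t / hbar))
    * spread(p, alpha0 * np.exp(-1j * energies * t / hbar))
    for t in times
])

print(products.min() / hbar, products.max() / hbar)
```

The printed minimum and maximum of $\Delta x\,\Delta p/\hbar$ show that the product oscillates at twice the oscillator frequency and never drops below $1/2$.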

There also exists an alternative approach to computing time-dependent averages
of various observables using the Heisenberg picture, which makes it possible to establish
their dependence on time in a more generic way. To develop such an approach, let me
first rewrite Eqs. 7.55 and 7.56 using Eqs. 7.25 and 7.26 for Schrödinger versions of
the coordinate and momentum operators, which I will designate here as $\hat{x}_0$ and $\hat{p}_0$ to
emphasize the fact that they serve as initial values for the Heisenberg equations:

$$\hat{x}_H = \hat{x}_0\cos\omega t + \frac{1}{m_e\omega}\hat{p}_0\sin\omega t, \tag{7.63}$$

$$\hat{p}_H = \hat{p}_0\cos\omega t - m_e\omega\hat{x}_0\sin\omega t. \tag{7.64}$$


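Now, let's say I want to compute the uncertainty of the coordinate for an arbitrary
state $|\alpha\rangle$. The expectation values of the coordinate and momentum in this state,
using Eqs. 7.63 and 7.64, can be written as

$$\langle x\rangle = \langle x_0\rangle\cos\omega t + \frac{1}{m_e\omega}\langle p_0\rangle\sin\omega t,$$

$$\langle p\rangle = \langle p_0\rangle\cos\omega t - m_e\omega\langle x_0\rangle\sin\omega t,$$

where $\langle x_0\rangle$ and $\langle p_0\rangle$ are time-independent "Schrödinger" expectation values that can
be computed for a given state using any of the representations for the Schrödinger
coordinate and momentum operators. Similarly, I can find for the expectation values
of the squared operators

$$\langle x^2\rangle = \langle x_0^2\rangle\cos^2\omega t + \frac{1}{2m_e\omega}\left(\langle x_0p_0\rangle + \langle p_0x_0\rangle\right)\sin 2\omega t + \frac{1}{m_e^2\omega^2}\langle p_0^2\rangle\sin^2\omega t,$$

$$\langle p^2\rangle = \langle p_0^2\rangle\cos^2\omega t - \frac{1}{2}m_e\omega\left(\langle x_0p_0\rangle + \langle p_0x_0\rangle\right)\sin 2\omega t + m_e^2\omega^2\langle x_0^2\rangle\sin^2\omega t,$$

which will yield the following for the uncertainties:

$$(\Delta x)^2 = (\Delta x_0)^2\cos^2\omega t + \frac{1}{m_e^2\omega^2}(\Delta p_0)^2\sin^2\omega t + \frac{1}{2m_e\omega}\left[\langle x_0p_0\rangle + \langle p_0x_0\rangle - 2\langle x_0\rangle\langle p_0\rangle\right]\sin 2\omega t,$$

$$(\Delta p)^2 = (\Delta p_0)^2\cos^2\omega t + m_e^2\omega^2(\Delta x_0)^2\sin^2\omega t - \frac{m_e\omega}{2}\left[\langle x_0p_0\rangle + \langle p_0x_0\rangle - 2\langle x_0\rangle\langle p_0\rangle\right]\sin 2\omega t.$$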
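
The practical value of these formulas is that only a handful of time-independent averages is needed. The sketch below illustrates this for $(\Delta x)^2$ under stated assumptions: it uses NumPy, truncated number-basis matrices, and a random state confined to the five lowest levels (the truncation size, the random seed, and the parameter values are arbitrary choices for this example). It compares the formula with a direct evaluation in the evolved state.

```python
import numpy as np

hbar, m_e, omega = 1.0, 1.5, 0.8            # illustrative values
dim = 12
a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
x0 = np.sqrt(hbar / (2 * m_e * omega)) * (a + a.T)
p0 = 1j * np.sqrt(m_e * hbar * omega / 2) * (a.T - a)
energies = hbar * omega * (np.arange(dim) + 0.5)

rng = np.random.default_rng(2)
state = np.zeros(dim, dtype=complex)
state[:5] = rng.normal(size=5) + 1j * rng.normal(size=5)   # only low levels populated
state /= np.linalg.norm(state)

def mean(op, vec):
    """Expectation value <vec|op|vec>."""
    return np.vdot(vec, op @ vec)

# Time-independent "Schrodinger" averages entering the formula.
x_mean, p_mean = mean(x0, state).real, mean(p0, state).real
dx0_sq = mean(x0 @ x0, state).real - x_mean**2
dp0_sq = mean(p0 @ p0, state).real - p_mean**2
cross = mean(x0 @ p0 + p0 @ x0, state).real - 2 * x_mean * p_mean

def dx_sq_formula(t):
    return (dx0_sq * np.cos(omega * t)**2
            + dp0_sq * np.sin(omega * t)**2 / (m_e * omega)**2
            + cross * np.sin(2 * omega * t) / (2 * m_e * omega))

def dx_sq_direct(t):
    vec = state * np.exp(-1j * energies * t / hbar)   # evolved state
    return mean(x0 @ x0, vec).real - mean(x0, vec).real**2

times = np.linspace(0.0, 10.0, 21)
formula = np.array([dx_sq_formula(t) for t in times])
direct = np.array([dx_sq_direct(t) for t in times])

print(np.max(np.abs(formula - direct)))
```

The two ways of computing $(\Delta x)^2$ agree to rounding accuracy at all sampled times.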

I already mentioned it once, but it is worth emphasizing again: all expectation values
in the expressions for $(\Delta x)^2$ and $(\Delta p)^2$ refer to Schrödinger operators and can be
computed using any of the representations for the latter. Let me illustrate this point
by considering the following example.

Example 19 (Harmonic Oscillator with Shifted Minimum of the Potential) Consider
a harmonic oscillator with mass $m_e$ and frequency $\omega$ in the ground state. Suddenly,
without disruption of the oscillator's state, the minimum of the potential shifts by
$d$ along the axis of oscillations and the stiffness of the potential changes such that
it is now characterized by a new classical frequency $\Omega$. Find the expectation values
and uncertainties of the coordinate and momentum of the electron in the potential with
the new position of its minimum.

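It is convenient to solve this problem using the coordinate representation for the
initial state and for the Schrödinger operators $x$ and $p$. First of all, let's agree to place
the origin of the $X$-axis at the new position of the minimum. Then, the initial wave
function, which is the ground state wave function of the oscillator with the potential in
its original position, is

$$\varphi_0(x+d) = \left(\frac{m_e\omega}{\pi\hbar}\right)^{1/4}\exp\left(-\frac{m_e\omega}{2\hbar}(x+d)^2\right),$$

where $x$ is counted from the new position of the potential. The expectation values of
the Schrödinger operators $\langle x_0\rangle$ and $\langle p_0\rangle$ are

$$\langle x_0\rangle = \sqrt{\frac{m_e\omega}{\pi\hbar}}\int_{-\infty}^{\infty}x\exp\left(-\frac{m_e\omega}{\hbar}(x+d)^2\right)dx = \sqrt{\frac{m_e\omega}{\pi\hbar}}\int_{-\infty}^{\infty}(\tilde{x}-d)\exp\left(-\frac{m_e\omega}{\hbar}\tilde{x}^2\right)d\tilde{x} = -d,$$

where I made the substitution of variables $\tilde{x} = x + d$ and took into account that the wave
function of the initial state is even, so the term linear in $\tilde{x}$ integrates to zero. Similarly, I can find

$$\langle p_0\rangle = -i\hbar\sqrt{\frac{m_e\omega}{\pi\hbar}}\int_{-\infty}^{\infty}\exp\left(-\frac{m_e\omega}{2\hbar}(x+d)^2\right)\frac{d}{dx}\left[\exp\left(-\frac{m_e\omega}{2\hbar}(x+d)^2\right)\right]dx$$

$$= i\hbar\sqrt{\frac{m_e\omega}{\pi\hbar}}\,\frac{m_e\omega}{\hbar}\int_{-\infty}^{\infty}\exp\left(-\frac{m_e\omega}{\hbar}(x+d)^2\right)(x+d)\,dx = 0.$$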

Thus, I have for the time-dependent expectation values

$$\langle x\rangle = -d\cos\Omega t$$
$$\langle p\rangle = m_e\Omega d\sin\Omega t.$$

(Obviously, the dynamics of the raising and lowering operators is now defined by the new
frequency $\Omega$.) In order to find the respective uncertainties, I need to compute
$\Delta x_0$, $\Delta p_0$, and $\langle x_0p_0\rangle$. The uncertainties of the regular Schrödinger coordinate and
momentum operators do not depend on the position of the potential minimum with
respect to the origin of the coordinate axes, so I can simply recycle the results from
Eqs. 7.34 and 7.35:

$$\Delta x_0 = \sqrt{\frac{\hbar}{2m_e\omega}};\qquad \Delta p_0 = \sqrt{\frac{\hbar m_e\omega}{2}}.$$

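For the last expectation value, $\langle x_0p_0\rangle$, I will actually have to do some work:

$$\langle x_0p_0\rangle = -i\hbar\sqrt{\frac{m_e\omega}{\pi\hbar}}\int_{-\infty}^{\infty}x\exp\left(-\frac{m_e\omega}{2\hbar}(x+d)^2\right)\frac{d}{dx}\left[\exp\left(-\frac{m_e\omega}{2\hbar}(x+d)^2\right)\right]dx$$

$$= i\hbar\left(\frac{m_e\omega}{\pi\hbar}\right)^{1/2}\left(\frac{m_e\omega}{\hbar}\right)\int_{-\infty}^{\infty}x(x+d)\exp\left(-\frac{m_e\omega}{\hbar}(x+d)^2\right)dx$$

$$= i\hbar\left(\frac{m_e\omega}{\pi\hbar}\right)^{1/2}\left(\frac{m_e\omega}{\hbar}\right)\int_{-\infty}^{\infty}\tilde{x}^2\exp\left(-\frac{m_e\omega}{\hbar}\tilde{x}^2\right)d\tilde{x} = \frac{i\hbar}{2}.$$

In the last line of this expression, I changed the integration variable to $\tilde{x} = x + d$ and took
into account that the integral with the factor linear in $\tilde{x}$ vanishes because of the oddness of
the integrand, while the integral containing $\tilde{x}^2$ together with the normalization factor of the
wave function reproduces the uncertainty of the coordinate $(\Delta x_0)^2$. If you are spooked by the
imaginary result here, you shouldn't be. The operator $x_0p_0$ is not Hermitian, and its expectation
value does not have to be real. To complete this calculation, I would have to compute $\langle p_0x_0\rangle$,
but I will save us some time and use the canonical commutation relation $[x, p] = i\hbar$
to find

$$\langle x_0p_0\rangle + \langle p_0x_0\rangle = 2\langle x_0p_0\rangle - i\hbar = 0.$$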


Oops, so much effort to get zero in the end? Feeling disappointed and a bit cheated?
Well, you should be, because we could have guessed that the answer here is zero
without any calculations. Indeed, the momentum operator contains the imaginary unit,
and with the wave function being completely real, this imaginary factor is not
going anywhere. But, on the other hand, the result must be real because $x_0p_0 + p_0x_0$
is a Hermitian operator. So, the only conclusion a reasonable person can draw from
this conundrum is that the result must be zero. Thus, we finally have for the
time-dependent uncertainties:


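$$(\Delta x)^2 = \frac{\hbar}{2m_e\omega}\cos^2\Omega t + \frac{1}{m_e^2\Omega^2}\frac{\hbar m_e\omega}{2}\sin^2\Omega t = \frac{\hbar}{2m_e\omega}\left[\cos^2\Omega t + \frac{\omega^2}{\Omega^2}\sin^2\Omega t\right]$$

$$(\Delta p)^2 = \frac{\hbar m_e\omega}{2}\cos^2\Omega t + m_e^2\Omega^2\frac{\hbar}{2m_e\omega}\sin^2\Omega t = \frac{\hbar m_e\omega}{2}\left[\cos^2\Omega t + \frac{\Omega^2}{\omega^2}\sin^2\Omega t\right]$$

and for their product

$$\begin{aligned}
(\Delta x)^2(\Delta p)^2 &= \frac{\hbar^2}{4}\left[\cos^4\Omega t + \sin^4\Omega t + \left(\frac{\omega^2}{\Omega^2} + \frac{\Omega^2}{\omega^2}\right)\cos^2\Omega t\sin^2\Omega t\right]\\
&= \frac{\hbar^2}{4}\left[\cos^4\Omega t + \sin^4\Omega t + 2\cos^2\Omega t\sin^2\Omega t + \left(\frac{\omega^2}{\Omega^2} + \frac{\Omega^2}{\omega^2} - 2\right)\cos^2\Omega t\sin^2\Omega t\right]\\
&= \frac{\hbar^2}{4}\left[1 + \left(\frac{\omega^2}{\Omega^2} + \frac{\Omega^2}{\omega^2} - 2\right)\cos^2\Omega t\sin^2\Omega t\right],
\end{aligned}$$

where in the second line I added and subtracted the term $2\cos^2\Omega t\sin^2\Omega t$ and
in the third line used the identity $\cos^4\Omega t + \sin^4\Omega t + 2\cos^2\Omega t\sin^2\Omega t =
(\cos^2\Omega t + \sin^2\Omega t)^2 = 1$. The function $y + 1/y$ with $y = \omega^2/\Omega^2$, which appears in the final result,
has a minimum equal to 2 at $y = 1$ ($\Omega = \omega$), at which point the product of the uncertainties
becomes $\hbar^2/4$. For all other relations between the two frequencies, the product of
the uncertainties exceeds this value in full agreement with the uncertainty principle.
It is interesting to note that as the uncertainties oscillate, their product returns to its
minimum value at times $t_n = n\pi/(2\Omega)$.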
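
The behavior of the uncertainty product is easy to check numerically. The following is only a minimal sketch: it assumes units with $\hbar = 1$ and $m_e = 1$, and the function names (`uncertainties`, `product`, `closed_form`) are invented for this illustration.

```python
import math

HBAR = 1.0  # assumption: natural units, hbar = 1


def uncertainties(t, omega, Omega, m=1.0):
    """Return ((dx)^2, (dp)^2) at time t after the sudden change omega -> Omega."""
    c2 = math.cos(Omega * t) ** 2
    s2 = math.sin(Omega * t) ** 2
    dx2 = HBAR / (2 * m * omega) * (c2 + (omega / Omega) ** 2 * s2)
    dp2 = HBAR * m * omega / 2 * (c2 + (Omega / omega) ** 2 * s2)
    return dx2, dp2


def product(t, omega, Omega, m=1.0):
    """Product (dx)^2 (dp)^2 computed directly from the two uncertainties."""
    dx2, dp2 = uncertainties(t, omega, Omega, m)
    return dx2 * dp2


def closed_form(t, omega, Omega):
    """Same product written as (hbar^2/4) [1 + (y + 1/y - 2) cos^2 sin^2]."""
    y = (omega / Omega) ** 2
    cs = (math.cos(Omega * t) * math.sin(Omega * t)) ** 2
    return HBAR ** 2 / 4 * (1 + (y + 1 / y - 2) * cs)


if __name__ == "__main__":
    omega, Omega = 1.0, 2.5
    for n in range(5):
        t_n = n * math.pi / (2 * Omega)   # times at which the minimum is reached
        print(n, product(t_n, omega, Omega))
```

Evaluating `product` on a grid of times shows that it never drops below $\hbar^2/4$ and touches this value only at $t_n = n\pi/(2\Omega)$.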

In the Schrödinger picture, the dynamics of quantum systems is described by
the time dependence of the vectors representing quantum states. For the initial state
given by Eq. 7.57, the time-dependent state can be presented as (see Eq. 4.15)

$$|\alpha(t)\rangle = \sum_{n=0}^{\infty}c_ne^{-i\omega(n+1/2)t}|n\rangle. \tag{7.65}$$

Computing the expectation value of the coordinate with this state and using again the
representation of the coordinate operator in terms of lowering and raising operators,
I have

$$\begin{aligned}
\langle x\rangle &= \sqrt{\frac{\hbar}{2m_e\omega}}\left[\sum_{n=0}^{\infty}\sum_{m=0}^{\infty}c_m^*c_ne^{i\omega(m-n)t}\langle m|a|n\rangle + \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}c_m^*c_ne^{i\omega(m-n)t}\langle m|a^\dagger|n\rangle\right]\\
&= \sqrt{\frac{\hbar}{2m_e\omega}}\left[\sum_{n=0}^{\infty}\sum_{m=0}^{\infty}c_m^*c_ne^{i\omega(m-n)t}\sqrt{n}\,\delta_{m,n-1} + \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}c_m^*c_ne^{i\omega(m-n)t}\sqrt{n+1}\,\delta_{m,n+1}\right]\\
&= \sqrt{\frac{\hbar}{2m_e\omega}}\left[e^{-i\omega t}\sum_{m=0}^{\infty}c_m^*c_{m+1}\sqrt{m+1} + e^{i\omega t}\sum_{m=0}^{\infty}c_{m+1}^*c_m\sqrt{m+1}\right]
\end{aligned} \tag{7.66}$$

in full agreement with Eq. 7.61 obtained using the Heisenberg representation. What
is interesting about this result is that in the beginning of the computations, we had
complex exponential functions with all frequencies $\omega(m-n)$. However, after the
matrix elements of the lowering and raising operators have been taken into account,
only terms with a single frequency $\omega$ survived. In the Heisenberg approach, frequencies
$\omega(m-n)$ never appear because the properties of $a$ and $a^\dagger$ are incorporated from
the very beginning at the level of the Heisenberg equations. Similar expressions can
be easily derived for the expectation values of the momentum operator:

$$\langle p\rangle = i\sqrt{\frac{\hbar m_e\omega}{2}}\left[e^{i\omega t}\sum_{m=0}^{\infty}c_{m+1}^*c_m\sqrt{m+1} - e^{-i\omega t}\sum_{m=0}^{\infty}c_m^*c_{m+1}\sqrt{m+1}\right],$$

while generic expressions for the uncertainties of the coordinate and momentum
operators in the Schrödinger picture are much more cumbersome and are more
difficult to derive. Thus, I will illustrate the derivation of the uncertainties for the
time-dependent states in the Schrödinger picture with the same Example 18, which
was previously solved in the Heisenberg picture.

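Example 20 (Uncertainties of the Coordinate and Momentum of the Quantum
Harmonic Oscillator in the Schrödinger Picture) Let me remind you that we are
dealing with a harmonic oscillator prepared in a state

$$|\alpha_0\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right),$$

and we want to compute the uncertainties of the coordinate and momentum
operators at an arbitrary time $t$ using the Schrödinger picture.

Comparing the expression for the initial state with Eq. 7.66, expansion
coefficients $c_n$ in Eq. 7.65 can be identified as $c_0 = c_1 = 1/\sqrt{2}$ while all other coefficients
vanish. Thus, the time-dependent state vector now becomes

$$|\alpha(t)\rangle = \frac{1}{\sqrt{2}}\left[\exp\left(-\frac{i}{2}\omega t\right)|0\rangle + \exp\left(-\frac{3i}{2}\omega t\right)|1\rangle\right].$$

The expectation values are immediately found from Eq. 7.66 to be as before

$$\langle x\rangle = \sqrt{\frac{\hbar}{2m_e\omega}}\cos\omega t;\qquad \langle p\rangle = -\sqrt{\frac{\hbar\omega m_e}{2}}\sin\omega t.$$

To find the uncertainties, I need

$$x^2 = \frac{\hbar}{2m_e\omega}\left(a^2 + (a^\dagger)^2 + aa^\dagger + a^\dagger a\right)$$

$$p^2 = -\frac{\hbar\omega m_e}{2}\left(a^2 + (a^\dagger)^2 - aa^\dagger - a^\dagger a\right).$$

Using again the properties of the lowering and raising operators, I find

$$x^2|\alpha(t)\rangle = \frac{\hbar}{2m_e\omega}\frac{1}{\sqrt{2}}\left[\sqrt{2}\exp\left(-\frac{i}{2}\omega t\right)|2\rangle + \exp\left(-\frac{i}{2}\omega t\right)|0\rangle + \sqrt{6}\exp\left(-\frac{3i}{2}\omega t\right)|3\rangle + 3\exp\left(-\frac{3i}{2}\omega t\right)|1\rangle\right].$$

Now, premultiplying this result by $\langle\alpha(t)|$ and using orthogonality of the
eigenvectors, I find

$$\langle x^2\rangle = \frac{\hbar}{m_e\omega};\qquad (\Delta x)^2 = \langle x^2\rangle - \langle x\rangle^2 = \frac{\hbar}{2m_e\omega}\left(2 - \cos^2\omega t\right)$$

in complete agreement with the results obtained in the Heisenberg picture. I will
leave the computation of the result for the momentum operator to you.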
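
The operator algebra of this example can be replayed on a computer. The sketch below is only an illustration under stated assumptions: states are stored as lists of amplitudes in a truncated number basis of size `N` (large enough that nothing is cut off for this state), the coordinate is measured in units of $\sqrt{\hbar/(m_e\omega)}$, and all function names are invented here.

```python
import cmath
import math

N = 6  # assumption: truncated basis |0>, ..., |5>, ample for this state


def lower(state):
    """Apply the lowering operator: a|n> = sqrt(n)|n-1>."""
    out = [0j] * N
    for n in range(1, N):
        out[n - 1] += math.sqrt(n) * state[n]
    return out


def raise_(state):
    """Apply the raising operator: a_dagger|n> = sqrt(n+1)|n+1>."""
    out = [0j] * N
    for n in range(N - 1):
        out[n + 1] += math.sqrt(n + 1) * state[n]
    return out


def x_op(state):
    """Coordinate in units of sqrt(hbar/(m_e*omega)): x = (a + a_dagger)/sqrt(2)."""
    lo, hi = lower(state), raise_(state)
    return [(u + v) / math.sqrt(2) for u, v in zip(lo, hi)]


def inner(bra, ket):
    """Scalar product <bra|ket>."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))


def state_at(phase):
    """State (e^{-i phase/2}|0> + e^{-3i phase/2}|1>)/sqrt(2), phase = omega*t."""
    psi = [0j] * N
    psi[0] = cmath.exp(-0.5j * phase) / math.sqrt(2)
    psi[1] = cmath.exp(-1.5j * phase) / math.sqrt(2)
    return psi


def variance_x(phase):
    """(Delta x)^2 in units of hbar/(m_e*omega)."""
    psi = state_at(phase)
    x_psi = x_op(psi)
    mean = inner(psi, x_psi).real
    mean_sq = inner(x_psi, x_psi).real  # <x^2> = |x psi|^2 because x is Hermitian
    return mean_sq - mean ** 2
```

In these units `variance_x(phase)` should reproduce $1 - \frac{1}{2}\cos^2\omega t$, which is the result quoted above divided by $\hbar/(m_e\omega)$.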


7.2 Isotropic Three-Dimensional Harmonic Oscillator 

Using the concept of normal coordinates, any three-dimensional (or even multiparticle)
harmonic oscillator can be reduced to a collection of one-dimensional
oscillators with the total Hamiltonian being the sum of one-dimensional Hamiltonians.
The spectrum of eigenvalues in this case is obtained by simply summing up the
eigenvalues of each one-dimensional component, and the respective eigenvectors
are obtained as direct products of one-dimensional eigenvectors. To illustrate this
point, consider a Hamiltonian of the form

$$H = \frac{p_x^2}{2m_{ex}} + \frac{p_y^2}{2m_{ey}} + \frac{p_z^2}{2m_{ez}} + \frac{1}{2}\left(m_{ex}\omega_x^2x^2 + m_{ey}\omega_y^2y^2 + m_{ez}\omega_z^2z^2\right) = H_x + H_y + H_z. \tag{7.67}$$

I can define a state characterized by three quantum numbers, $|n_x, n_y, n_z\rangle$, which can
be considered as a "product" of the one-dimensional eigenvectors defined in the
previous section, $|n_x, n_y, n_z\rangle = |n_x\rangle|n_y\rangle|n_z\rangle$, where the last notation does not presume
any kind of actual "multiplication" but just serves as a reminder that the $x$-dependent
part of the Hamiltonian 7.67 acts only on the $|n_x\rangle$ portion of the eigenvector, $H_y$
acts only on $|n_y\rangle$, and so on. Thus, as a result, I have

$$\left(H_x + H_y + H_z\right)|n_x\rangle|n_y\rangle|n_z\rangle = \left[\hbar\omega_x\left(n_x + \frac{1}{2}\right) + \hbar\omega_y\left(n_y + \frac{1}{2}\right) + \hbar\omega_z\left(n_z + \frac{1}{2}\right)\right]|n_x\rangle|n_y\rangle|n_z\rangle,$$

where $n_x$, $n_y$, and $n_z$ independently take integer values starting from zero. The position
representation of the eigenvectors is obtained as

$$\varphi_{n_x,n_y,n_z}(x,y,z) = \langle x,y,z|n_x,n_y,n_z\rangle = \langle x|n_x\rangle\langle y|n_y\rangle\langle z|n_z\rangle = \varphi_{n_x}(x)\varphi_{n_y}(y)\varphi_{n_z}(z),$$

where each $\varphi_{n_i}(r_i)$ is given by Eq. 7.44.

In the most general case, when the parameters in $H_x$, $H_y$, and $H_z$ are all different,
we end up with distinct eigenvalues characterized by three independent integers.
The energy of the ground state is characterized by $n_x = n_y = n_z = 0$ and is given
by $E_{0,0,0} = \frac{\hbar}{2}\left(\omega_x + \omega_y + \omega_z\right)$.

If, however, all masses and all three frequencies are equal to each other so that
the Hamiltonian becomes

$$H = \frac{p_x^2 + p_y^2 + p_z^2}{2m_e} + \frac{1}{2}m_e\omega^2\left(x^2 + y^2 + z^2\right), \tag{7.68}$$

a new phenomenon emerges. The energy eigenvalues are now given by

$$E_{n_x,n_y,n_z} = \hbar\omega\left(\frac{3}{2} + n_x + n_y + n_z\right)$$

and take the same value for different eigenvectors as long as the respective indexes
obey the condition $n = n_x + n_y + n_z$. In other words, the eigenvalues in this case become
degenerate: several distinct vectors belong to the same eigenvalue. The number of
degenerate eigenvectors is relatively easy to compute: for each $n$ you can choose $n_x$
to be anything between $0$ and $n$, and once $n_x$ is chosen, $n_y$ can be anything between
$0$ and $n - n_x$, so there are $n - n_x + 1$ choices. Once $n_x$ and $n_y$ are determined, the
remaining quantum number $n_z$ becomes uniquely defined. Thus, the total number of
choices of $n_x$ and $n_y$ for any given $n$ can be found as

$$\sum_{n_x=0}^{n}\left(n - n_x + 1\right) = (n+1)(n+1) - n(n+1)/2 = (n+1)(n+2)/2.$$

This degeneracy can be easily traced to the symmetry of the system, which has 
emerged once I made the parameters of the oscillator independent of the direction. 
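
The counting argument can be confirmed by brute-force enumeration. This is a small sketch with function names chosen here for illustration; it lists every triple $(n_x, n_y, n_z)$ of non-negative integers with a given sum $n$ and compares the count with $(n+1)(n+2)/2$.

```python
def cartesian_states(n):
    """All (n_x, n_y, n_z) with non-negative entries adding up to n."""
    return [(nx, ny, n - nx - ny)
            for nx in range(n + 1)
            for ny in range(n - nx + 1)]   # n_z is fixed once n_x, n_y are chosen


def degeneracy_formula(n):
    """Closed-form degeneracy (n + 1)(n + 2)/2."""
    return (n + 1) * (n + 2) // 2


if __name__ == "__main__":
    for n in range(5):
        print(n, len(cartesian_states(n)), degeneracy_formula(n))
```

For $n = 1$ the enumeration returns the three states $(0,0,1)$, $(0,1,0)$, and $(1,0,0)$, as expected for the first excited level.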


7.2.1 Isotropic Oscillator in Spherical Coordinates 


Even though we already know the solution to the problem of an isotropic harmonic
oscillator, it is instructive to reconsider it by working in the position representation
and using the spherical coordinate system instead of the Cartesian one. The position
representation of the Hamiltonian in this case becomes

$$\begin{aligned}
H &= -\frac{\hbar^2}{2m_e}\nabla^2 + \frac{1}{2}m_e\omega^2r^2\\
&= -\frac{\hbar^2}{2m_er^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{L^2}{2m_er^2} + \frac{1}{2}m_e\omega^2r^2,
\end{aligned} \tag{7.69}$$

where in the second line I used Eq. 5.62 representing the Laplacian operator in terms of
the radial coordinate $r$ and the operator $L^2$. It is obvious that the Hamiltonian commutes
with both $L^2$ and $L_z$ so that the eigenvectors of the Hamiltonian in the position
representation can be written as

$$\psi_{n_r,l,m}(r,\theta,\varphi) = Y_l^m(\theta,\varphi)R_{n_r,l}(r). \tag{7.70}$$

Substituting Eq. 7.70 into the time-independent Schrödinger equation

$$H\psi = E\psi$$

with the Hamiltonian given by Eq. 7.69, you can derive for the radial function $R_{n_r,l}(r)$:

$$-\frac{\hbar^2}{2m_er^2}\frac{d}{dr}\left(r^2\frac{dR_{n_r,l}}{dr}\right) + \frac{\hbar^2l(l+1)}{2m_er^2}R_{n_r,l} + \frac{1}{2}m_e\omega^2r^2R_{n_r,l} = E_{l,n_r}R_{n_r,l}. \tag{7.71}$$


It is convenient to introduce an auxiliary function $u_{n_r,l}(r) = rR_{n_r,l}$, which, when
inserted into the radial equation above, turns it into

$$-\frac{\hbar^2}{2m_e}\frac{d^2u_{n_r,l}}{dr^2} + \frac{\hbar^2l(l+1)}{2m_er^2}u_{n_r,l} + \frac{1}{2}m_e\omega^2r^2u_{n_r,l} = E_{l,n_r}u_{n_r,l}. \tag{7.72}$$

Equation 7.72 looks exactly like a one-dimensional Schrödinger equation with
effective potential:

$$V_{eff} = \frac{\hbar^2l(l+1)}{2m_er^2} + \frac{1}{2}m_e\omega^2r^2.$$

The plot of this potential (Fig. 7.3) shows that it possesses a minimum

$$V_{eff}^{min} = \hbar\omega\sqrt{l(l+1)}$$

at

$$r_{min}^2 = \frac{\hbar}{m_e\omega}\sqrt{l(l+1)}.$$

Fig. 7.3 The schematic of the effective potential for the radial Schrödinger equation
for the isotropic 3-D harmonic oscillator in arbitrary units

(Of course, you do not need to plot this function to know that it has a minimum:
just compute the derivative and find its zero.) For any given $l$, the allowed values
of energy obeying the inequality $E > \hbar\omega\sqrt{l(l+1)}$ correspond to the classical bound
motion; thus all energy levels in this effective potential are discrete (which, of
course, is nobody's surprise, but still is a nice fact to know). States with $l = 0$ are
described by the Schrödinger equation, which is the exact replica of the equation
for the one-dimensional oscillator. You, however, should not rush to pull out of
the drawer old dusty solutions of the one-dimensional problem (OK, not that old
and dusty, but still). A huge difference with the purely one-dimensional case is the
fact that the domain of the radial coordinate $r$ is $[0, \infty)$, unlike the domain of a
coordinate in the one-dimensional problem, which is $(-\infty, \infty)$. Consequently, the
wave function $u_{n_r,l}$ must obey a boundary condition at $r = 0$. Given that the actual
radial function $R_{n_r,l}$ must remain finite at $r = 0$, it is clear that $u_{n_r,l}(0) = 0$. Now you
can go ahead, brush the dust from the solutions of the one-dimensional harmonic
oscillator problem, and see which fit this requirement. A bit of examination reveals
that we have to throw out all even solutions with quantum numbers $0, 2, 4, \ldots$,
which do not satisfy the boundary condition at the origin. At the same time,
all odd solutions, characterized by quantum numbers $1, 3, 5, \ldots$, satisfy both the
Schrödinger equation and the newly minted boundary condition at $r = 0$, so they
(restricted to the positive values of the coordinate) do represent eigenvectors of the
isotropic oscillator with zero angular momentum.
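
The position and depth of the minimum of the effective potential can also be checked numerically. The sketch below assumes units with $\hbar = m_e = \omega = 1$, in which $V_{eff} = l(l+1)/(2r^2) + r^2/2$, and uses a plain ternary search; the function names are made up for this illustration.

```python
import math


def v_eff(r, l):
    """Effective potential in units hbar = m_e = omega = 1 (assumption)."""
    return l * (l + 1) / (2 * r * r) + 0.5 * r * r


def minimize(l, lo=1e-3, hi=10.0, iters=200):
    """Ternary search for the minimum; valid because V_eff has one minimum for l > 0."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if v_eff(m1, l) < v_eff(m2, l):
            hi = m2
        else:
            lo = m1
    r = 0.5 * (lo + hi)
    return r, v_eff(r, l)


if __name__ == "__main__":
    for l in (1, 2, 3):
        r_min, v_min = minimize(l)
        # Both columns should agree with sqrt(l(l+1)) in these units.
        print(l, r_min ** 2, v_min, math.sqrt(l * (l + 1)))
```

In these units both $r_{min}^2$ and $V_{eff}^{min}$ equal $\sqrt{l(l+1)}$, which is what the search returns to within its numerical accuracy.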

Solving this problem with $l > 0$ requires a bit more work. To make it somewhat
easier to follow, I will begin by introducing a dimensionless radial coordinate
$\varsigma = r/\xi$, where $\xi = \sqrt{\hbar/m_e\omega}$ is the same length scale that was used in the
one-dimensional problem. The Schrödinger equation rewritten in this variable becomes

$$-\frac{\hbar\omega}{2}\frac{d^2u_{n_r,l}}{d\varsigma^2} + \frac{\hbar\omega\,l(l+1)}{2\varsigma^2}u_{n_r,l} + \frac{1}{2}\hbar\omega\varsigma^2u_{n_r,l} = E_{l,n_r}u_{n_r,l}$$

$$\frac{d^2u_{n_r,l}}{d\varsigma^2} - \frac{l(l+1)}{\varsigma^2}u_{n_r,l} - \varsigma^2u_{n_r,l} + \epsilon_{l,n_r}u_{n_r,l} = 0 \tag{7.73}$$

where I introduced dimensionless energy

$$\epsilon_{l,n_r} = \frac{2E_{l,n_r}}{\hbar\omega}.$$

The resulting differential equation obviously does not have simple solutions
expressible in terms of elementary functions. One of the approaches to solving
it is to present a solution in the form of a power series $\sum_j c_j\varsigma^j$ and search for
unknown coefficients $c_j$. In principle, knowing these coefficients is equivalent to
knowing the entire function. Before attempting this approach, however, it would be
wise to try to extract whatever information about the solution this equation might
contain. For instance, you can ask about the solution's behavior at very small and
very large values of $\varsigma$. When $\varsigma \ll 1$, the main term in Eq. 7.73 is the angular
momentum contribution to the effective potential. Neglecting all other terms, you
end up with an equation

$$\frac{d^2u_{n_r,l}}{d\varsigma^2} - \frac{l(l+1)}{\varsigma^2}u_{n_r,l} = 0, \tag{7.74}$$

which has a simple power solution

$$u_{n_r,l} = A\varsigma^{l+1}. \tag{7.75}$$


You are welcome to plug it back into Eq. 7.74 and verify it by yourself. For those
who think that I used magical divination to arrive at this result, I have disappointing
news: Eq. 7.74 belongs to a well-known class of so-called homogeneous equations.
This means that if I multiply $\varsigma$ by an arbitrary constant factor $\lambda$, the equation does
not change (check it), with a consequence that if function $u(\varsigma)$ is the solution, so
is function $u(\lambda\varsigma)$. Such equations are solved by power functions $u \propto \varsigma^q$, where
power $q$ is found by plugging this function into the equation.
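
Spelled out, the substitution of a power function into Eq. 7.74 amounts to the short computation below. The remark about the second root is added here as a reminder and relies only on the boundary condition $u(0) = 0$ discussed earlier.

```latex
% Substituting u = \varsigma^{q} into Eq. 7.74:
\begin{align*}
  q(q-1)\,\varsigma^{q-2} - l(l+1)\,\varsigma^{q-2} &= 0
  \quad\Longrightarrow\quad q^{2} - q - l(l+1) = 0, \\
  q &= l + 1 \qquad\text{or}\qquad q = -l .
\end{align*}
% The root q = -l gives u \propto \varsigma^{-l}, which does not vanish at
% \varsigma = 0 and is therefore incompatible with u(0) = 0; only q = l + 1 survives.
```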

In the limit of large $\varsigma \gg 1$, the main contribution to Eq. 7.73 comes from the
harmonic potential. We know from solving the one-dimensional problem that the
respective wave functions contain an exponential term $\exp(-x^2/2)$ for the $x$ direction
and similar terms for the two other coordinates. When multiplying all these wave
functions together to obtain a three-dimensional wave function, these exponential
terms turn into $\exp(-\varsigma^2/2)$; thus it is natural to expect that the radial function $u_{n_r,l}$
will contain such a factor as well. To verify this assumption, I am going to substitute
$\exp(-\varsigma^2/2)$ into Eq. 7.73 and see if it will satisfy the equation, at least in the limit
$\varsigma \to \infty$. Neglecting all terms small compared to $\varsigma^2$, I find

$$\frac{d^2}{d\varsigma^2}\exp\left(-\varsigma^2/2\right) \approx \varsigma^2\exp\left(-\varsigma^2/2\right).$$

Substituting this result into Eq. 7.73, and neglecting all terms except the harmonic
potential, I find that this function is, indeed, an asymptotically accurate solution
of this equation. I want you to really appreciate this result: in order to reproduce
exponential decay of the wave function, which, by the way, almost ensures its
normalizability, using a power series, we would have to keep track of all the infinite
number of terms in it, which is quite difficult if not outright impossible. By pulling
out this exponential term as well as the power law for small $\varsigma$, you might entertain
some hope that the remaining dependence on $\varsigma$ is simple enough to be dug out.

Thus, my next step is to present function $u_{n_r,l}(\varsigma)$ as

$$u_{l,n_r}(\varsigma) = A\varsigma^{l+1}\exp\left(-\varsigma^2/2\right)v_{l,n_r}(\varsigma) \tag{7.76}$$

and derive a differential equation for the remaining function $v_{l,n_r}(\varsigma)$. To this end, I
first compute

$$\begin{aligned}
\frac{1}{A}\frac{d^2u_{l,n_r}}{d\varsigma^2} &= \frac{d}{d\varsigma}\left[(l+1)\varsigma^{l}e^{-\varsigma^2/2}v_{l,n_r} - \varsigma^{l+2}e^{-\varsigma^2/2}v_{l,n_r} + \varsigma^{l+1}e^{-\varsigma^2/2}\frac{dv_{l,n_r}}{d\varsigma}\right]\\
&= l(l+1)\varsigma^{l-1}e^{-\varsigma^2/2}v_{l,n_r} - (l+1)\varsigma^{l+1}e^{-\varsigma^2/2}v_{l,n_r} + (l+1)\varsigma^{l}e^{-\varsigma^2/2}\frac{dv_{l,n_r}}{d\varsigma}\\
&\quad - (l+2)\varsigma^{l+1}e^{-\varsigma^2/2}v_{l,n_r} + \varsigma^{l+3}e^{-\varsigma^2/2}v_{l,n_r} - \varsigma^{l+2}e^{-\varsigma^2/2}\frac{dv_{l,n_r}}{d\varsigma}\\
&\quad + (l+1)\varsigma^{l}e^{-\varsigma^2/2}\frac{dv_{l,n_r}}{d\varsigma} - \varsigma^{l+2}e^{-\varsigma^2/2}\frac{dv_{l,n_r}}{d\varsigma} + \varsigma^{l+1}e^{-\varsigma^2/2}\frac{d^2v_{l,n_r}}{d\varsigma^2}\\
&= e^{-\varsigma^2/2}\varsigma^{l-1}v_{l,n_r}(\varsigma)\left(l(l+1) - \varsigma^2(2l+3) + \varsigma^4\right)\\
&\quad + \varsigma^{l}e^{-\varsigma^2/2}\frac{dv_{l,n_r}}{d\varsigma}\left(2l + 2 - 2\varsigma^2\right) + \varsigma^{l+1}e^{-\varsigma^2/2}\frac{d^2v_{l,n_r}}{d\varsigma^2}.
\end{aligned}$$

Frankly speaking, I did not have to torture you with these tedious calculations: such
computational platforms as Mathematica or Maple work with symbolic expressions
and can perform this computation faster and more reliably (and, yes, I did check my
result against Mathematica's). Substituting this expression into Eq. 7.73, I get (and
here you are on your own, or you can try computer algebra to reproduce this result)

$$\varsigma\frac{d^2v_{l,n_r}}{d\varsigma^2} + 2\left(l + 1 - \varsigma^2\right)\frac{dv_{l,n_r}}{d\varsigma} + \varsigma v_{l,n_r}(\varsigma)\left(\epsilon_{l,n_r} - 2l - 3\right) = 0. \tag{7.77}$$
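
If no computer algebra system is at hand, the second derivative of $\varsigma^{l+1}\exp(-\varsigma^2/2)v(\varsigma)$ can still be spot-checked with finite differences. The sketch below is only a sanity check under stated assumptions: the constant $A$ is set to 1, the trial function $v = \cos\varsigma$ is an arbitrary choice made here (any smooth function would do), and the function names are illustrative.

```python
import math


def v(s):
    """Arbitrary smooth trial function (assumption, not a solution of Eq. 7.77)."""
    return math.cos(s)


def dv(s):
    return -math.sin(s)


def d2v(s):
    return -math.cos(s)


def u(s, l):
    """u = s^(l+1) exp(-s^2/2) v(s), with the constant A set to 1."""
    return s ** (l + 1) * math.exp(-s * s / 2) * v(s)


def d2u_formula(s, l):
    """Second derivative of u assembled from the three collected terms."""
    e = math.exp(-s * s / 2)
    return (e * s ** (l - 1) * v(s) * (l * (l + 1) - s * s * (2 * l + 3) + s ** 4)
            + s ** l * e * dv(s) * (2 * l + 2 - 2 * s * s)
            + s ** (l + 1) * e * d2v(s))


def d2u_numeric(s, l, h=1e-4):
    """Central finite-difference estimate of the second derivative."""
    return (u(s + h, l) - 2 * u(s, l) + u(s - h, l)) / (h * h)


if __name__ == "__main__":
    for l in range(4):
        for s in (0.5, 1.3, 2.0):
            print(l, s, d2u_formula(s, l), d2u_numeric(s, l))
```

The two columns printed for each pair $(l, \varsigma)$ should agree to several digits, the residual difference being the finite-difference error.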

Now I can start solving this equation by presenting the unknown function $v_{l,n_r}(\varsigma)$
as a power series and trying to find the corresponding coefficients:

$$v_{l,n_r}(\varsigma) = \sum_{j=0}^{\infty}c_j\varsigma^j. \tag{7.78}$$

The goal is to plug this expression into Eq. 7.77 and collect coefficients in front of
equal powers of $\varsigma$. First, I blindly substitute the series into Eq. 7.77 and separate all
sums with different powers of $\varsigma$:

$$\sum_{j=0}^{\infty}c_jj(j-1)\varsigma^{j-1} + 2(l+1)\sum_{j=0}^{\infty}c_jj\varsigma^{j-1} - 2\sum_{j=0}^{\infty}c_jj\varsigma^{j+1} + \left(\epsilon_{l,n_r} - 2l - 3\right)\sum_{j=0}^{\infty}c_j\varsigma^{j+1} = 0.$$

Combining the first two and last two sums, I get

$$\sum_{j=0}^{\infty}j\left[j - 1 + 2l + 2\right]c_j\varsigma^{j-1} + \sum_{j=0}^{\infty}\left[\epsilon_{l,n_r} - 2l - 3 - 2j\right]c_j\varsigma^{j+1} = 0.$$

Next I notice that in the first sum, contributions from terms with $j = 0$ vanish, so
that this sum starts with $j = 1$. I can reset the count of the summation index back to
zero by introducing a new index $k = j - 1$, so that this sum becomes

$$\sum_{k=0}^{\infty}c_{k+1}(k+1)(k + 2l + 2)\varsigma^{k}.$$

Renaming $k$ back to $j$ (this is a dummy index, so you can call it whatever you want,
it does not care), we rewrite the previous equation as

$$\sum_{j=0}^{\infty}(j+1)\left[j + 2l + 2\right]c_{j+1}\varsigma^{j} + \sum_{j=0}^{\infty}\left[\epsilon_{l,n_r} - 2l - 3 - 2j\right]c_j\varsigma^{j+1} = 0.$$

The first sum in this expression begins with the $\varsigma^0$ term multiplied by coefficient $c_1$.
The second sum, however, begins with the term linear in $\varsigma$ and does not contain $\varsigma^0$ at
all. To satisfy the equation, coefficients in front of each power of $\varsigma$ must vanish
independently of each other, so we have to set $c_1 = 0$. This makes the first sum
again start with $j = 1$. Utilizing the same trick as before, I am replacing $j$ with
$j + 1$ while restarting the count from the new $j = 0$ again. The result is as follows:

$$\sum_{j=0}^{\infty}(j+2)\left[j + 2l + 3\right]c_{j+2}\varsigma^{j+1} + \sum_{j=0}^{\infty}\left[\epsilon_{l,n_r} - 2l - 3 - 2j\right]c_j\varsigma^{j+1} = 0.$$

Now I can, finally, combine the two sums and equate the resulting coefficient in
front of $\varsigma^{j+1}$ to zero:

$$(j+2)\left[j + 2l + 3\right]c_{j+2} = \left[2l + 3 + 2j - \epsilon_{l,n_r}\right]c_j$$

or

$$c_{j+2} = \frac{2l + 3 + 2j - \epsilon_{l,n_r}}{(j+2)\left[j + 2l + 3\right]}c_j. \tag{7.79}$$


This is a so-called recursion relation, which allows computing all expansion
coefficients recursively starting with the first one. It is important to note that Eq. 7.79
connects only coefficients with indexes of the same parity: all coefficients with
even indexes are expressed in terms of $c_0$, and all coefficients with odd indexes
are expressed in terms of $c_1$. But, wait, haven't we determined a few lines back that
$c_1 = 0$? Actually, we did determine that, and now, thanks to Eq. 7.79, I can establish
that not only $c_1$ but all coefficients with odd indexes are zeroes. So, it looks like I
achieved the announced goal of finding all coefficients in the power series expansion
of $v_{l,n_r}$. Formally speaking, I did, indeed, but it is a bit too early to dance around
the fire and celebrate. First, I still do not know what values of the dimensionless
energy $\epsilon_{l,n_r}$ correspond to the respective eigenvectors, and, second, I have to verify
that the found solution is, indeed, normalizable. The last issue is not trivial because
we are dealing with an infinite series here, so there are always questions about its
convergence and the behavior of the function it represents. As I shall demonstrate
now, both these questions are connected and will be answered together.

Whether a function is normalizable or not is determined by its behavior for large
values of its argument. I pulled out an exponentially decreasing factor from the
solution hoping that it would be sufficient to guarantee normalization, but to be
sure I need to consider the behavior of $v_{l,n_r}$ at $\varsigma \to \infty$. Any finite number of
terms in the expansion 7.78 cannot overcome the exponentially decreasing factor
$\exp(-\varsigma^2/2)$, so the anticipated danger can only come from the tail of the power
series, i.e., from coefficients $c_j$ with $j \to \infty$. In this limit the recursion relation 7.79
can be simplified to

$$c_{j+2} \approx \frac{2}{j}c_j, \tag{7.80}$$

which, when applied repeatedly, yields

$$c_{2j_0+2N} = \frac{2^N}{2j_0(2j_0+2)\cdots(2j_0+2N-2)}c_{2j_0} = \frac{1}{j_0(j_0+1)\cdots(j_0+N-1)}c_{2j_0}.$$

When writing this expression, I explicitly took into account that there are only even
indexes, which can be presented as $2j_0 + 2k$, with the total number of recursive factors
being $N$. Even though this expression is only valid for $j_0 \gg 1$, I can extend it to
all values of $j_0$ because, as I pointed out earlier, any finite number of terms in the
power series would not affect its asymptotic behavior. That means that the large-$\varsigma$
behavior of the series in question is the same as that of the series:

$$\sum_{j}\frac{\varsigma^{2j}}{j!} = \exp\left(\varsigma^2\right).$$

Even after combining this result with the $\exp(-\varsigma^2/2)$ factor, which was pulled out
earlier, I still end up with function $u_{n_r,l}(\varsigma)$ behaving as $\exp(\varsigma^2/2)$ at infinity. What
a bummer! It is disappointing, but not really surprising: it is easy to check that
$\exp(\varsigma^2/2)$ is the second possible asymptotic solution of Eq. 7.73, which I chose
to discard because of its non-normalizable nature. Well, this is how it often is: you
chase math out of the door, but it always comes back through the window to bite
you. So, the question now is if there is anything I can do to save the normalizability
of our solution. That light at the end of the tunnel will appear if you recall that
any power series with a finite number of terms cannot overpower an exponentially
decreasing function. Therefore, if I find a way to terminate the series at some finite
number of terms, our conundrum will be resolved. To see how this is possible, let's
take another look at the recursion relation, Eq. 7.79. What if at some value of $j$,
which I will call $2n_r$ to emphasize its evenness, the numerator of this relation turns
to zero? If this were to happen, then coefficient $c_{2n_r+2}$ would vanish and vanquish
all subsequent coefficients as well, so that instead of an infinite series, I will end
up with a finite sum. This will surely guarantee the normalizability of the found
solution. The condition for the numerator to vanish reads as

$$2l + 3 + 4n_r - \epsilon_{l,n_r} = 0,$$

which is immediately recognizable as an equation for the dimensionless energy $\epsilon_{l,n_r}$!
While resolving the normalization problem, I automatically solved the
eigenvalue problem as well. Using

$$\epsilon_{l,n_r} = 3 + 2(l + 2n_r)$$

as well as the relation between $\epsilon_{l,n_r}$ and actual energy eigenvalues, I obtain

$$E_{l,n_r} = \hbar\omega\left(\frac{3}{2} + l + 2n_r\right).$$
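
The link between termination of the series and normalizability can be seen numerically. The following sketch sums the series generated by the recursion relation 7.79 with $c_0 = 1$ (a normalization chosen here for convenience) and evaluates $u = \varsigma^{l+1}\exp(-\varsigma^2/2)v$ at a large argument, once for an allowed value of the dimensionless energy and once for a value slightly off. Function names and the cutoff `terms` are assumptions of this illustration.

```python
import math


def series_v(s, l, eps, terms=200):
    """Partial sum of v = sum_j c_j s^j with c_j from Eq. 7.79 and c_0 = 1."""
    term, total = 1.0, 0.0          # term holds c_j * s**j for the current even j
    for j in range(0, 2 * terms, 2):
        total += term
        term *= (2 * l + 3 + 2 * j - eps) / ((j + 2) * (j + 2 * l + 3)) * s * s
        if term == 0.0:             # numerator vanished: the series terminates
            break
    return total


def u(s, l, eps):
    """Radial function u = s^(l+1) exp(-s^2/2) v(s), up to normalization."""
    return s ** (l + 1) * math.exp(-s * s / 2) * series_v(s, l, eps)


if __name__ == "__main__":
    l = 0
    for eps in (7.0, 7.5):          # 7.0 = 3 + 2(l + 2 n_r) with n_r = 1
        print(eps, [u(s, l, eps) for s in (2.0, 4.0, 6.0, 8.0)])
```

For the allowed value the function dies off quickly with growing $\varsigma$, while for the shifted value it grows without bound, which is the numerical face of the $\exp(\varsigma^2/2)$ asymptotic.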
Thus, for each $l$ and $n_r$, you have an energy value and a respective wave function

$$u_{l,n_r}(\varsigma) = \varsigma^{l+1}\exp\left(-\varsigma^2/2\right)\sum_{j=0}^{2n_r}c_j\varsigma^j \tag{7.81}$$

where coefficients $c_j$ are given by Eq. 7.79. To get a better feeling for this result,
consider a few special examples.

1. $n_r = 0$. In this case the sum in Eq. 7.81 contains a single term $c_0$, so the
non-normalized wave function becomes

$$u_{l,0}(\varsigma) = c_0\varsigma^{l+1}\exp\left(-\varsigma^2/2\right)$$

with respective energy value $E_{l,0} = \hbar\omega\left(\frac{3}{2} + l\right)$.

2. $n_r = 1$. Using Eq. 7.79 with $\epsilon_{l,1} = 3 + 2(l + 2)$, I find for $c_2$ (substituting $j = 0$
into Eq. 7.79):

$$c_2 = \frac{2l + 3 - (3 + 2l + 4)}{2\left[2l + 3\right]}c_0 = -\frac{2}{2l+3}c_0$$

so that

$$u_{l,1}(\varsigma) = c_0\varsigma^{l+1}\exp\left(-\varsigma^2/2\right)\left(1 - \frac{2\varsigma^2}{2l+3}\right).$$
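
The recursion relation 7.79 is straightforward to automate. The sketch below uses exact rational arithmetic from the Python standard library and sets $c_0 = 1$ (an arbitrary normalization chosen for this illustration); the function name `radial_coefficients` is made up here.

```python
from fractions import Fraction


def radial_coefficients(l, n_r):
    """Even coefficients c_0, c_2, ..., c_{2 n_r} from Eq. 7.79 with c_0 = 1."""
    eps = 3 + 2 * (l + 2 * n_r)            # allowed dimensionless energy
    coeffs = [Fraction(1)]
    for j in range(0, 2 * n_r, 2):
        ratio = Fraction(2 * l + 3 + 2 * j - eps, (j + 2) * (j + 2 * l + 3))
        coeffs.append(coeffs[-1] * ratio)  # c_{j+2} = ratio * c_j
    return coeffs


if __name__ == "__main__":
    for l in range(3):
        for n_r in range(3):
            print(l, n_r, [str(c) for c in radial_coefficients(l, n_r)])
```

For $n_r = 1$ the function returns $1$ and $-2/(2l+3)$, matching the second example, and for $l = 0$, $n_r = 2$ it returns $1, -4/3, 4/15$, which reproduces, up to an overall factor, the odd Hermite polynomial $H_5$ divided by its argument, as the discussion of the $l = 0$ states suggests.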

Following this pattern you can compute the wave functions belonging to any
eigenvalue. For higher energy eigenvalues, it would take more time and effort,
of course, but you can always give this task to a computer. Before finishing this
section, I would like to note that the energy eigenvalues depend only on the sum
$l + 2n_r$ rather than on each of these quantum numbers separately. It makes sense,
therefore, to introduce a main quantum number $n = l + 2n_r$ and use it to characterize
energy values:

$$E_n = \hbar\omega\left(\frac{3}{2} + n\right). \tag{7.82}$$

Then, the radial wave functions will be labeled by indexes $l$ and $n$ with a requirement
$n - l = 2n_r \geq 0$, while the total wave function includes spherical harmonics and an
additional index $m$. In actual physical variables, it becomes

$$\psi_{n,l,m}(r,\theta,\varphi) = \frac{1}{r}\left(\frac{r}{\xi}\right)^{l+1}\exp\left(-\frac{r^2}{2\xi^2}\right)\sum_{j=0}^{n-l}c_j\left(\frac{r}{\xi}\right)^jY_l^m(\theta,\varphi) \tag{7.83}$$

where I reintroduced radial function $R_{n_r,l} = u_{n_r,l}/r$. This function is not normalized
until the value of coefficient $c_0$ in its radial part is defined, but I am not going to
bother you with that. Instead, I will compute the degree of degeneracy of an energy
eigenvalue characterized by main number $n$, which is much more fun. Taking into
account that for each $l$ there are $2l + 1$ possible values of $m$, and that $l$ runs all the
way down from $n$ in increments of 2 ($n - l$ must remain an even number), the total
number of states with given $n$ is

$$\sum_{l}(2l + 1) = (n + 1) + n(n + 1)/2 = (n + 1)(n + 2)/2,$$

where the summation over $l$ is carried with increments of 2. It is a nice feeling
to realize that this expression for degeneracy agrees with the one obtained using
Cartesian coordinates.
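
The same count can be reproduced by listing the allowed pairs $(l, m)$ explicitly. This is a short sketch with an illustrative function name; it only encodes the two rules stated above, namely that $l$ decreases from $n$ in steps of 2 and that $m$ runs from $-l$ to $l$.

```python
def spherical_states(n):
    """All (l, m) pairs allowed for the main quantum number n = l + 2*n_r."""
    return [(l, m)
            for l in range(n, -1, -2)      # l = n, n - 2, ..., down to 1 or 0
            for m in range(-l, l + 1)]     # 2l + 1 values of m for each l


if __name__ == "__main__":
    for n in range(6):
        print(n, len(spherical_states(n)), (n + 1) * (n + 2) // 2)
```

The two printed columns coincide for every $n$, which is the agreement between the spherical and Cartesian counts.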
The resulting expression for the wave function given by Eq. 7.83 is an alternative
way to produce a position representation of the harmonic oscillator wave function
and is quite remarkably different from the one obtained using Cartesian coordinates.
One might wonder why it is at all possible to have such distinct ways to represent
the same eigenvector. After all, isn't a representation, once chosen, supposed to provide
a unique way to describe a quantum state? The fact of the matter is that it is, indeed,
so only if the corresponding eigenvalue is non-degenerate. In the degenerate case, one
can form an infinite number of linear combinations of the eigenvectors, and any
one of them will realize the same representation of the corresponding state. In the
case of the isotropic harmonic oscillator, it means that the wave functions expressed
in spherical coordinates can be presented as linear combinations of their Cartesian
counterparts and vice versa.


7.3 Quantization of Electromagnetic Field and Harmonic 
Oscillators 

7.3.1 Electromagnetic Field as a Harmonic Oscillator 

Even though the idea of photons (the quanta of electromagnetic field) was one
of the first quantum ideas introduced into the consciousness of physicists by Einstein
in 1905,¹ the full quantum description of electromagnetic field turned out to
be a rather difficult problem. The first serious attempt at developing quantum
electrodynamics was undertaken by Paul Dirac in his famous 1927 paper,² which
was just the beginning of a long and difficult path walked by too many brilliant
physicists to be mentioned in this book. Here are just a few names of those who
made critical theoretical contributions to this field: German-American Hans Bethe,
Japanese Sin-Itiro Tomonaga, and Americans Julian Schwinger, Richard Feynman,
and Freeman Dyson. Quantum electrodynamics is a difficult subject addressed in
multiple specialized books and is beyond the scope of this text. Nevertheless, I
would love to scratch a bit from the surface of this field and demonstrate how
ideas developed in the course of studying the harmonic oscillator emerge in new
and unexpected places.


¹ The irony is that an explanation of the photoelectric effect did not require the quantization of light
despite what you might have read or heard. All experimental data could have been explained
treating light classically while describing electrons in metals by the Schrödinger equation.
Fortunately Einstein did not have the Schrödinger equation in 1905 and couldn't know that.
Science does evolve in mysterious ways: Einstein's erroneous idea about the photoelectric effect
inspired de Broglie and Schrödinger and brought about the Schrödinger equation, which could
have been used to disprove the idea. Compton's effect, on the other hand, can indeed be considered
as a proof of the reality of photons.

² P.A.M. Dirac, The quantum theory of the emission and absorption of radiation. Proc. R. Soc.
Lond. 114, 243 (1927).
To this end, I propose considering a toy model of electromagnetic field, in which
the field is described by single components of electric and magnetic fields:

$$E_x = a\mathcal{E}_0(t)\sin kz \tag{7.84}$$

$$B_y = -\frac{1}{c}aB_0(t)\cos kz \tag{7.85}$$

where I introduced a normalization coefficient $a$ to be defined later; the extra factor
$1/c$, where $c$ is the speed of light in vacuum, in the formula for the magnetic field,
ensures that amplitudes $\mathcal{E}_0$ and $B_0$ have the same dimension (you might remember
from the introductory course on electromagnetism the relation $E = cB$ between electric
and magnetic fields in a plane wave), and the negative sign is included for future
convenience. The Maxwell equations for the electromagnetic field in this simplified
case take the form

$$\frac{\partial E_x}{\partial z} = -\frac{\partial B_y}{\partial t}$$

$$\frac{\partial B_y}{\partial z} = -\frac{1}{c^2}\frac{\partial E_x}{\partial t}.$$

Plugging in the expressions for electric and magnetic fields given by Eqs. 7.84
and 7.85, you will find that the spatial dependence chosen for the fields in these
equations is indeed consistent with the Maxwell equations, which will be reduced
to the system of ordinary differential equations:

= coSo(t) (7.86) 

= -coB 0 (t). (7.87) 

Parameter co appearing in these equations is defined as co = ck. It is easy to see that 
amplitudes of both electric and magnetic fields obey the same differential equation 
as a harmonic oscillator. For instance, differentiating the first of these equations with 
respect to time and using the second equation to replace the time derivative of the 
electric field, you will get 


dBo 
dt 
d £o 
dt 


d 2 B 0 
dt 2 


+ q) 2 B 0 = 0 . 


A similar equation can be derived for $\mathcal{E}_0$. You can also notice that Eqs. 7.86 and 7.87 
bear some resemblance to the Hamiltonian equations of classical mechanics, 
and this may make you wonder if they can be derived from some kind of a 
Hamiltonian. If you are asking why on earth I would want to re-derive these 
equations from a Hamiltonian, you were not paying attention to the first 130 pages of 
the book. The Hamiltonian formalism allows us to introduce canonical pairs of variables, 
which we can turn into operators obeying canonical commutation relations; thus a 
Hamiltonian formulation is the key to turning the classical theory of the electromagnetic 
field into the quantum one. 
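
The oscillator-like behavior of Eqs. 7.86 and 7.87 can also be checked numerically. The following is only a sketch: the function names, the numerical values of $\omega$ and of the initial amplitudes, and the choice of a fourth-order Runge-Kutta integrator are assumptions made for illustration, not anything prescribed by the derivation above. It integrates the two first-order equations and compares the result with the harmonic solution $B_0(t) = B_0(0)\cos\omega t + \mathcal{E}_0(0)\sin\omega t$.

```python
import math

def rhs(omega, B0, E0):
    """Right-hand sides of Eqs. 7.86 and 7.87: dB0/dt = omega*E0, dE0/dt = -omega*B0."""
    return omega * E0, -omega * B0

def integrate(omega, B0, E0, t_end, steps):
    """Advance (B0, E0) from t = 0 to t_end with the classical fourth-order Runge-Kutta scheme."""
    h = t_end / steps
    for _ in range(steps):
        k1B, k1E = rhs(omega, B0, E0)
        k2B, k2E = rhs(omega, B0 + 0.5 * h * k1B, E0 + 0.5 * h * k1E)
        k3B, k3E = rhs(omega, B0 + 0.5 * h * k2B, E0 + 0.5 * h * k2E)
        k4B, k4E = rhs(omega, B0 + h * k3B, E0 + h * k3E)
        B0 += h * (k1B + 2 * k2B + 2 * k3B + k4B) / 6
        E0 += h * (k1E + 2 * k2E + 2 * k3E + k4E) / 6
    return B0, E0

# Illustrative (hypothetical) values in arbitrary units.
omega = 2.0
B_init, E_init = 1.0, 0.5
t_end = 3.0

B_num, E_num = integrate(omega, B_init, E_init, t_end, 3000)

# Harmonic-oscillator solution with the same initial conditions.
B_exact = B_init * math.cos(omega * t_end) + E_init * math.sin(omega * t_end)
E_exact = E_init * math.cos(omega * t_end) - B_init * math.sin(omega * t_end)

print(B_num - B_exact, E_num - E_exact)
```

Both differences should be at the level of the integrator's truncation error, which is the numerical counterpart of the statement that each amplitude obeys the harmonic oscillator equation.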

How would one go about introducing a Hamiltonian for the electromagnetic 
field? Naturally, one starts by remembering that the Hamiltonian is the energy of the 
system and that the energy of the electromagnetic field is given by 

$$H = \frac{1}{2}\int_V \left(\varepsilon_0 E^2 + \frac{1}{\mu_0} B^2\right) dV, \tag{7.88}$$

where integration is carried over the entire space occupied by the field. However, if 
you attempt to directly compute this integral using Eqs. 7.84 and 7.85 for electric 
and magnetic fields, you will encounter a problem: the integral is infinite. This 
happens because the field occupies the entire infinite space and does not decrease 
with distance. To fix the problem, I introduce a large but finite region of volume 
$V = L_z S_{xy}$, where $L_z$ is the linear dimension of this region in the $z$ direction and $S_{xy}$ is 
the area of the limiting plane perpendicular to it, and assume that the field vanishes 
outside of this region. This trick is very popular in physics, and you will encounter 
it in different circumstances later in the book. It can be justified by noting that the 
notion of a field occupying the entire space is by itself quite artificial with no relation 
to reality. It is also natural to assume that the properties of the field far away from 
the region of actual interest should not affect any observable phenomena, so that we 
can choose them to be as convenient for us as possible. 

With this in mind, I can write the integral in Eq. 7.88 as 

$$H = a^2 S_{xy}\left\{\frac{1}{2}\varepsilon_0 \mathcal{E}_0^2 \int_0^{L_z} dz\,\sin^2 kz + \frac{1}{2\mu_0 c^2} B_0^2 \int_0^{L_z} dz\,\cos^2 kz\right\} = \frac{1}{4} a^2 \varepsilon_0 S_{xy} L_z \left[\mathcal{E}_0^2 + B_0^2\right],$$

where I assumed that $k$ satisfies the condition $kL_z = \pi n$, $n = 1, 2, \cdots$, making $\cos 2kz = 1$ 
at both the lower and upper integration limits so that the respective terms cancel 
out. Also, at the last step, I made the substitution $(\mu_0 \varepsilon_0)^{-1} = c^2$. You might, of 
course, object to the artificial discretization of the wave number and the imposition of 
arbitrary conditions on the values of the electric and magnetic fields at $z = L_z$. 
So, what can I say in my defense? First, in the limit $L_z \to \infty$, which I can take after 
everything is said and done, the discretization will disappear, and as you will see in 
a few short minutes, I will make the dependence on the volume, which popped up in 
the last expression for the Hamiltonian, disappear as well. Second, I can invoke the 
same argument I just made when limiting the field to the finite volume: the behavior 
of the field in any finite region of space shall not be affected by its values at an 
infinitely remote plane. If you are still not convinced, I have my last line of defense: 
it works! 

Now, I am ready to fix the normalization parameter $a$ introduced in Eqs. 7.84 
and 7.85. For reasons which will become clear later, I will choose it to be 

$$a = \sqrt{\frac{2\omega}{\varepsilon_0 V}}, \tag{7.89}$$

so that the final expression for the energy of the field becomes 

$$H = \frac{\omega}{2}\left(\mathcal{E}_0^2 + B_0^2\right). \tag{7.90}$$

Did you notice that the dependence on the volume in the Hamiltonian is gone? This 
is fiction of course, because I have simply hidden it inside formulas for the fields, 
but in all expressions concerned with actual physical observables, it will vanish in 
all honesty. 
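
The disappearance of the volume can be checked numerically. The sketch below integrates the energy density of Eq. 7.88 over the box for the toy-model fields of Eqs. 7.84 and 7.85 with the normalization of Eq. 7.89 and compares the result with Eq. 7.90. The function name, the amplitudes, the box sizes, and the midpoint integration rule are all assumptions chosen for illustration; two boxes with the same wave number but volumes differing by a factor of 100 are used to show that the energy does not depend on $V$.

```python
import math

EPS0 = 8.8541878128e-12        # vacuum permittivity, F/m
C = 299792458.0                # speed of light in vacuum, m/s
MU0 = 1.0 / (EPS0 * C ** 2)    # from (mu0 * eps0)^(-1) = c^2

def field_energy(E0, B0, L_z, S_xy, n, steps=20000):
    """Integrate the energy density of Eq. 7.88 over the box for the toy-model fields."""
    k = math.pi * n / L_z                               # discretized wave number, k * L_z = pi * n
    omega = C * k
    a = math.sqrt(2.0 * omega / (EPS0 * L_z * S_xy))    # Eq. 7.89 with V = L_z * S_xy
    dz = L_z / steps
    total = 0.0
    for i in range(steps):                              # midpoint rule along z
        z = (i + 0.5) * dz
        Ex = a * E0 * math.sin(k * z)                   # Eq. 7.84
        By = -(a / C) * B0 * math.cos(k * z)            # Eq. 7.85
        total += 0.5 * (EPS0 * Ex ** 2 + By ** 2 / MU0) * dz
    return S_xy * total, omega

E0, B0 = 1.3, 0.7                                       # arbitrary amplitudes for the check
H_small, omega = field_energy(E0, B0, L_z=2.0, S_xy=0.5, n=3)
H_large, _ = field_energy(E0, B0, L_z=20.0, S_xy=5.0, n=30)   # same k, 100 times the volume
H_formula = 0.5 * omega * (E0 ** 2 + B0 ** 2)           # Eq. 7.90

print(H_small / H_formula, H_large / H_formula)
```

Both ratios should come out equal to one within rounding error, in line with the remark that the volume has simply been absorbed into the definition of the field amplitudes.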

Equation 7.90 looks very much like the Hamiltonian of a harmonic oscillator. 
The first term can be interpreted as kinetic energy, with $\mathcal{E}_0$ playing the role of the 
canonical momentum and $1/\omega$ replacing the mass, and the second term is an 
analog of the potential energy with $B_0$ as the conjugate coordinate (note that the 
coefficient $m_e\omega^2/2$ in the harmonic oscillator potential energy is reduced to the factor 
$\omega/2$ in Eq. 7.90 if you replace $m_e$ with $1/\omega$). If you wonder why I chose the electric 
field to represent the momentum and the magnetic field to be the coordinate, and not 
vice versa, just compare Eqs. 7.86 and 7.87 with the Hamiltonian equations 7.8 and 7.7, 
paying attention to the placement of the negative sign in these equations. You can 
easily see that the Hamiltonian equations reproduce Eqs. 7.86 and 7.87, justifying 
this identification. But do not be fooled. Identifying the magnetic field with the coordinate 
and the electric field with the momentum is, of course, a matter of convention resulting 
from the choice to place the negative sign in Eq. 7.85. 

The Hamiltonian formulation of the classical Maxwell equations allows me now 
to introduce the quantum description of the fields. This is done by promoting $\mathcal{E}_0$ and 
$B_0$ to operators with the standard canonical commutation relation: 

$$\left[\hat{B}_0, \hat{\mathcal{E}}_0\right] = i\hbar. \tag{7.91}$$

As a result, the classical Hamiltonian, Eq. 7.90, becomes a Hamiltonian operator: 

$$\hat{H} = \frac{\omega}{2}\left(\hat{\mathcal{E}}_0^2 + \hat{B}_0^2\right). \tag{7.92}$$

It is easy to see from Eq. 7.90 that both $\mathcal{E}_0$ and $B_0$ have the dimension of 
$\sqrt{\text{energy} \times \text{time}}$, so that the dimension of the commutator on the left-hand side of 
Eq. 7.91 is energy × time, which coincides with the dimension of Planck’s constant, 
as it should. This result is not particularly surprising, of course, but it is always 
useful to check your dimensions once in a while just to make sure that your theory 
does not have any of the most basic problems. Using Eq. 7.91 together with Eqs. 7.84 
and 7.85, I can compute the commutator of the non-zero components of the electric 
and magnetic fields, which, of course, are now also operators: 

$$\left[\hat{B}_y, \hat{E}_x\right] = -\frac{i\hbar\omega}{\varepsilon_0 c V}\sin 2kz. \tag{7.93}$$

One immediate consequence of this result is the uncertainty relation for these 
components: 

$$\Delta B_y\,\Delta E_x \geq \frac{\hbar\omega}{2\varepsilon_0 c V}\,|\sin 2kz|, \tag{7.94}$$

which shows that just like the coordinate and momentum, electric and magnetic 
fields cannot both be known with certainty in the same quantum state. 
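
For readers who want the intermediate steps, the passage from Eq. 7.91 to Eqs. 7.93 and 7.94 can be sketched as follows. The only ingredients assumed here are the field definitions of Eqs. 7.84 and 7.85, the normalization of Eq. 7.89, the identity $\sin kz \cos kz = \frac{1}{2}\sin 2kz$, and the general uncertainty relation $\Delta A\,\Delta B \geq \frac{1}{2}|\langle[\hat{A},\hat{B}]\rangle|$.

```latex
\begin{aligned}
\left[\hat{B}_y, \hat{E}_x\right]
  &= -\frac{a^2}{c}\,\cos kz\,\sin kz\,\left[\hat{B}_0, \hat{\mathcal{E}}_0\right]
   = -\frac{1}{c}\,\frac{2\omega}{\varepsilon_0 V}\,\frac{\sin 2kz}{2}\,i\hbar
   = -\frac{i\hbar\omega}{\varepsilon_0 c V}\,\sin 2kz, \\
\Delta B_y\,\Delta E_x
  &\geq \frac{1}{2}\left|\left\langle\left[\hat{B}_y, \hat{E}_x\right]\right\rangle\right|
   = \frac{\hbar\omega}{2\varepsilon_0 c V}\,|\sin 2kz| .
\end{aligned}
```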

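The canonical commutator, Eq. 7.91, also indicates that in the representation using 
eigenvectors of $\hat{B}_0$ as a basis, in which states are represented by wave functions 
dependent on the magnetic field amplitude $B_0$, the electric 
field amplitude operator $\hat{\mathcal{E}}_0$ is represented by 

$$\hat{\mathcal{E}}_0 = -i\hbar\frac{\partial}{\partial B_0},$$

while the Hamiltonian takes the form 

$$\hat{H} = -\frac{\hbar^2\omega}{2}\frac{\partial^2}{\partial B_0^2} + \frac{\omega}{2}B_0^2.$$
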
Comparing this expression with the quantum Hamiltonian of the harmonic oscillator 
in the coordinate representation, you can see that they are mathematically identical 
if again you replace $m_e$ with $1/\omega$. The wave functions representing eigenvectors of 
this Hamiltonian can be written in this representation as 

$$\varphi_n(B_0) = \frac{1}{\sqrt{2^n n! \sqrt{\pi}\,\xi_{em}}}\exp\left(-\frac{B_0^2}{2\xi_{em}^2}\right) H_n\!\left(\frac{B_0}{\xi_{em}}\right), \tag{7.95}$$

where the characteristic scale of the quantum fluctuations of the magnetic field, 
$\xi_{em}$, is determined solely by Planck’s constant, $\xi_{em} = \sqrt{\hbar}$ (this result follows from 
Eq. 7.42 after the substitution $m_e = 1/\omega$). As with any wave function, $|\varphi_n(B_0)|^2$ 
determines the probability density for the magnetic field amplitude. 

While it is interesting to see how one can turn the coordinate representation 
of the harmonic oscillator into the magnetic field representation of the quantum 
electromagnetic theory, the practical value of this representation is quite limited. 
Much more important, from both theoretical and practical points of view, is the 
opportunity to introduce electromagnetic analogs of lowering and raising operators. 
In order to distinguish these operators from those used in the harmonic oscillator 
problem, I will use the notation $\hat{b}$ and $\hat{b}^\dagger$ (do not confuse these operators with the variables 
$b$ used in the description of the classical oscillator), where 

$$\hat{b} = \frac{1}{\sqrt{2\hbar}}\left(\hat{B}_0 + i\hat{\mathcal{E}}_0\right) \tag{7.96}$$

$$\hat{b}^\dagger = \frac{1}{\sqrt{2\hbar}}\left(\hat{B}_0 - i\hat{\mathcal{E}}_0\right). \tag{7.97}$$

Equations 7.96 and 7.97 are obtained from Eqs. 7.22 and 7.23 by setting $m_e\omega = 1$ 
and replacing $x$ and $p$ by $B_0$ and $\mathcal{E}_0$, respectively. The Hamiltonian 7.92 expressed in 
terms of these operators acquires a familiar form: 

$$\hat{H} = \hbar\omega\left(\hat{b}^\dagger\hat{b} + 1/2\right).$$

All commutators computed in Sect. 7.1.1 remain exactly the same, so 
I can simply reproduce the results from that section: the energy eigenvalues of the 
electromagnetic field are given again by 

$$E_n = \hbar\omega\left(n + 1/2\right), \tag{7.98}$$

while eigenvectors can be constructed from the ground state $|0\rangle$ as 

$$|n\rangle = \frac{1}{\sqrt{n!}}\left(\hat{b}^\dagger\right)^n |0\rangle. \tag{7.99}$$

Formally, both these results are exactly the same as in the case of the harmonic 
oscillator. However, the physical interpretation of the integer n in these expressions 
and, therefore, of both energy values and eigenvectors is completely different. 
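
The algebra behind these statements can be checked on a computer by writing the operators as matrices in a truncated number basis. The sketch below is an illustration under stated assumptions: natural units $\hbar = \omega = 1$, a basis cut off at $N = 8$ states, and hand-written helper functions for matrix arithmetic. It builds $\hat{B}_0$ and $\hat{\mathcal{E}}_0$ from $\hat{b}$ and $\hat{b}^\dagger$ by inverting Eqs. 7.96 and 7.97, forms the Hamiltonian of Eq. 7.92, and reads off its diagonal. The last basis state is excluded from the comparison because the truncation distorts it.

```python
import math

HBAR = 1.0    # natural units (an assumption made for this sketch)
OMEGA = 1.0
N = 8         # size of the truncated number basis

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(A, B, sB=1.0):
    """Return A + sB * B element by element."""
    n = len(A)
    return [[A[i][j] + sB * B[i][j] for j in range(n)] for i in range(n)]

def scale(A, s):
    return [[s * x for x in row] for row in A]

# <m|b|n> = sqrt(n) for m = n - 1; the matrix is real, so the dagger is the transpose.
b = [[math.sqrt(n) if m == n - 1 else 0.0 for n in range(N)] for m in range(N)]
b_dag = [[b[n][m] for n in range(N)] for m in range(N)]

# Inverting Eqs. 7.96 and 7.97: B0 = sqrt(hbar/2)(b + b_dag), E0 = i sqrt(hbar/2)(b_dag - b).
B0 = scale(add(b, b_dag), math.sqrt(HBAR / 2))
E0 = scale(add(b_dag, b, -1.0), 1j * math.sqrt(HBAR / 2))

H = scale(add(matmul(E0, E0), matmul(B0, B0)), OMEGA / 2)      # Eq. 7.92
comm = add(matmul(B0, E0), matmul(E0, B0), -1.0)               # [B0, E0]

energies = [H[n][n].real for n in range(N - 1)]                # skip the truncated last state
print(energies)
print(comm[0][0])
```

The printed diagonal should reproduce $\hbar\omega(n + 1/2)$, and the commutator element should equal $i\hbar$, as required by Eq. 7.91.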

Indeed, in the case of a harmonic oscillator, we have a material particle, which 
can be placed in states with different energies, counted by the integer n. The 
electromagnetic field, on the other hand, once created, carries a certain amount of 
energy, and the same field cannot be made to have “more” energy. To produce a 
field with higher energy, you need to increase its amplitude, i.e., add “more” field. 
The discrete nature of allowed energy levels tells us that the energy of the field can 
only be increased in finite increments: to go from a state of the electromagnetic field 
with energy $E_n$ to the state with energy $E_{n+1}$, you have to add a discrete “quantum” 
of field with energy $\hbar\omega$. This discrete energy quantum is what was introduced by 
Einstein in 1905 as “das Lichtquant.” Replacing the term “quantum of light” 
with the term “photon,”3 you can say that the number $n$ is the number of photons in 
a given state and that going from state $|n\rangle$ to state $|n+1\rangle$ amounts to generating 
or creating an extra photon, while transitioning to state $|n-1\rangle$ means removing 
or annihilating a photon. To emphasize this point, the operators $\hat{b}^\dagger$ and $\hat{b}$ are called, in 
the context of quantum electromagnetic field theory, “creation” and “annihilation” 
operators, respectively, rather than raising and lowering operators. The ground state 
$|0\rangle$ in this interpretation is the state with zero photons and is called, therefore, the 
vacuum state. A counterintuitive aspect of the vacuum state is that even though 
it is devoid of photons, it still has non-zero energy, which in our oversimplified 
model is just $\hbar\omega/2$. At first glance this might appear to be a nonsensical result: how 
can zero photons have non-zero energy? I hope it will not blow your mind 
if I say that in a more complete theory, which takes into account multiple modes 
(waves with different wave vectors $k$) of the electromagnetic field, the “vacuum” energy 
might become formally infinite. In order to wrap your mind around this weird result, 
consider the following. 

3 It is interesting that the term “photon” was used for the first time in an obscure paper by the 
American chemist Gilbert Lewis in 1926. His paper is forgotten, but the term he coined lives on. 

The photon is not just “a quantum of electromagnetic field” as you might 
have read in popular books and introductory physics texts. The concept of a 
“photon” has a quite specific, mathematically rigorous meaning: a single photon 
is an eigenvector of the electromagnetic Hamiltonian characterized by $n = 1$. 
Eigenvectors characterized by higher values of $n$ describe $n$-photon states. The states 
described by eigenvectors of the Hamiltonian are not states in which the electric 
or magnetic field has any definite value. Moreover, the commutation relation, 
Eq. 7.93, and the uncertainty relation 7.94 that follows from it indicate that there are no 
states in which the electric and magnetic fields both have definite values. Furthermore, in 
states with fixed photon numbers, the expectation values of the electric and magnetic 
fields are zero, just like the expectation values of the coordinate and momentum 
operators of the mechanical harmonic oscillator. At the same time, the expectation 
values of the squares of the fields are not zero, and these are the quantities which 
determine the energy of the fields. In the state with zero photons, these non-vanishing 
mean squares are what we call vacuum fluctuations of the 
electromagnetic field, where vacuum has, again, a very specific meaning: it is not 
just emptiness or a void; it is a state with zero photons, which is not the same as a 
state with zero field. 
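
These statements can be made concrete with a short calculation. The lines below are a sketch that uses only the relations $\hat{B}_0 = \sqrt{\hbar/2}\,(\hat{b} + \hat{b}^\dagger)$ and $\hat{\mathcal{E}}_0 = i\sqrt{\hbar/2}\,(\hat{b}^\dagger - \hat{b})$, obtained by inverting Eqs. 7.96 and 7.97, together with $[\hat{b}, \hat{b}^\dagger] = 1$ and $\hat{b}^\dagger\hat{b}\,|n\rangle = n\,|n\rangle$.

```latex
\begin{aligned}
\langle n|\hat{B}_0|n\rangle &= \sqrt{\frac{\hbar}{2}}\,\langle n|\left(\hat{b} + \hat{b}^\dagger\right)|n\rangle = 0,
\qquad
\langle n|\hat{\mathcal{E}}_0|n\rangle = i\sqrt{\frac{\hbar}{2}}\,\langle n|\left(\hat{b}^\dagger - \hat{b}\right)|n\rangle = 0, \\
\langle n|\hat{B}_0^2|n\rangle &= \frac{\hbar}{2}\,\langle n|\left(\hat{b}^2 + \hat{b}^{\dagger 2} + 2\hat{b}^\dagger\hat{b} + 1\right)|n\rangle
 = \hbar\left(n + \frac{1}{2}\right)
 = \langle n|\hat{\mathcal{E}}_0^2|n\rangle, \\
\langle n|\hat{H}|n\rangle &= \frac{\omega}{2}\left(\langle n|\hat{\mathcal{E}}_0^2|n\rangle + \langle n|\hat{B}_0^2|n\rangle\right)
 = \hbar\omega\left(n + \frac{1}{2}\right).
\end{aligned}
```

For $n = 0$ each mean square equals $\hbar/2$, which accounts for the vacuum energy $\hbar\omega/2$ even though both mean fields vanish.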

The second issue which needs to be discussed in connection with vacuum energy 
is, again, the fact that a zero level of energy is always established arbitrarily. The 
vacuum energy, which we found, is counted from the (non-existent in quantum 
theory) state, in which both electric and magnetic fields are presumed to be zeroes. 
As long as the energy of the vacuum state does not change, while the phenomena 
we are interested in play out, we can set the vacuum energy to zero with no 
consequences for any physically significant results. To provide a counterexample to 
this statement, let me briefly describe a situation in which this assumption might not 
be true. If you consider the electromagnetic field between two conducting plates, 
the modes of the field and, therefore, its vacuum energy depend on the distance 
between the plates. This distance can be changed, in which case the vacuum energy 
also changes. Because of this capacity to change, the vacuum energy becomes relevant, resulting in a 
tiny but observable attractive force acting between the plates, known as the Casimir 
force. In most other situations, however, the vacuum energy is just a constant, whose 
value (finite or infinite) has no physical significance. 

Thus, the eigenvectors of the electromagnetic Hamiltonian representing states 
with a definite number of photons, $n$, bear little resemblance to classical 
electromagnetic waves, just like stationary states of the harmonic oscillator have no 
relation to the motion of the classical pendulum. At the same time, in Sect. 7.1.2, 
I demonstrated that a generic nonstationary state reproduces oscillations of the 
expectation values of coordinate and momentum resembling those of their classical 
counterparts. While this result is true for a generic initial state, and the behavior of 
the expectation values to a large extent does not depend on the details of that state, not all initial 
states are created equal. However, to notice the difference between them, we have to 
go beyond the expectation values and consider the uncertainties of both coordinate 
and momentum or, in the electromagnetic context, of the electric and magnetic fields. 
The fact that different initial states result in different behavior of the uncertainties 
has already been demonstrated in the examples presented in Sect. 7.1.2. However, 
out of the multitude of possible initial states, there exist states for which these 
uncertainties are minimized in the sense that their product has the smallest value allowed by 
the uncertainty principle. In the electromagnetic case it means that the sign $\geq$ 
in Eq. 7.94 is replaced with $=$. These states are called “coherent” states, and they 
are much more important in the electrodynamic than in the mechanical context, 
so this is where I shall deal with them. 


7.3.2 Coherent States of the Electromagnetic Field 

The coherent states are defined as eigenvectors of the annihilation operator: 

$$\hat{b}\,|\alpha\rangle = \alpha\,|\alpha\rangle. \tag{7.100}$$

Since the annihilation operator is not Hermitian, you should not expect the 
eigenvalues to be real, and we do not know yet if they are continuous or discrete. 
I can, however, try to find the representation of the vectors $|\alpha\rangle$ in the basis of the 
eigenvectors $|n\rangle$ of the electromagnetic Hamiltonian: 

$$|\alpha\rangle = \sum_{n=0}^{\infty} c_n\,|n\rangle, \tag{7.101}$$

where $c_n = \langle n|\alpha\rangle$. The Hermitian conjugation of Eq. 7.99 yields 

$$\langle n| = \frac{1}{\sqrt{n!}}\,\langle 0|\,\hat{b}^n, \tag{7.102}$$

so that I can find for the expansion coefficients 

$$c_n = \frac{1}{\sqrt{n!}}\,\langle 0|\,\hat{b}^n\,|\alpha\rangle = \frac{\alpha^n}{\sqrt{n!}}\,\langle 0|\alpha\rangle.$$

The only unknown quantity here is $c_0 = \langle 0|\alpha\rangle$, which I find by requiring that $|\alpha\rangle$ 
is normalized, which means that $\sum_n |c_n|^2 = 1$. Applying this last condition, I have 

$$|c_0|^2 \sum_{n=0}^{\infty} \frac{|\alpha|^{2n}}{n!} = |c_0|^2 \exp\left(|\alpha|^2\right) = 1,$$

where I recalled that $\sum_n x^n/n!$ is the power series expansion of the exponential 
function of $x$. Thus, choosing $c_0$ to be real-valued, I have the following final 
expression for the expansion coefficients: 

$$c_n = e^{-|\alpha|^2/2}\,\frac{\alpha^n}{\sqrt{n!}}. \tag{7.103}$$


Equation 7.103 together with Eq. 7.101 completely defines a coherent state with 
eigenvalue $\alpha$. Since the derivation of the eigenvector did not produce any restrictions 
on $\alpha$, it must be presumed to be a continuous complex-valued variable. The vector 
that I found describes a state which is a superposition of states with different 
numbers of photons and, correspondingly, with different energies. Accordingly, the 
number of photons in this state is a random quantity with a probability distribution 
given by 

$$p_n = |c_n|^2 = e^{-|\alpha|^2}\,\frac{|\alpha|^{2n}}{n!}. \tag{7.104}$$

Equation 7.104 describes a well-known probability distribution, called the Poisson 
distribution, which appears in a large number of physical and mathematical 
problems. This distribution describes the probability that n events will happen within 
some fixed interval (of time or of distances) provided that the probability of each 
event is independent of the occurrence of the others and all events are happening at a 
constant rate (probability per unit time or unit length or unit volume does not depend 
upon time or position). This distribution describes, for instance, the probability that 
n atoms will undergo radioactive decay within some time interval or the number 
of uniformly distributed non-interacting gas molecules that will be found occupying 
some volume in space. For more examples of the Poisson distribution, just google it. 
The entire Poisson distribution depends on a single parameter $|\alpha|^2$, whose physical 
meaning can be elucidated by computing the mean (or expectation value) of the 
number of photons $\bar{n}_\alpha$ in the state $|\alpha\rangle$: 


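$$\bar{n}_\alpha = \sum_{n=0}^{\infty} n\,p_n = e^{-|\alpha|^2}\sum_{n=0}^{\infty} n\,\frac{|\alpha|^{2n}}{n!}$$

$$= e^{-|\alpha|^2}\sum_{n=1}^{\infty} \frac{|\alpha|^{2n}}{(n-1)!} = e^{-|\alpha|^2}\sum_{k=0}^{\infty} \frac{|\alpha|^{2(k+1)}}{k!} = e^{-|\alpha|^2}\,|\alpha|^2 \sum_{k=0}^{\infty} \frac{|\alpha|^{2k}}{k!} = |\alpha|^2,$$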


where I first took into account that the $n = 0$ term in the sum 
is multiplied by $n = 0$ and, therefore, does not contribute. Accordingly, I started 
the sum with $n = 1$, after which I introduced a new index $k = n - 1$, which 
reset the counter back to zero. As a result, I gained an extra factor $|\alpha|^2$, while the 
remaining sum became just an exponential function canceling out the normalization 
factor $e^{-|\alpha|^2}$. This calculation shows that $|\alpha|^2$ has the meaning of the average 
number of photons in the state with eigenvalue $\alpha$. It is also interesting to compute 
the uncertainty of the number of photons in this state, $\Delta n = \sqrt{\langle n^2\rangle_\alpha - \bar{n}_\alpha^2}$. 
First, I compute $\langle n^2\rangle_\alpha$: 



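$$\langle n^2\rangle_\alpha = \sum_{n=0}^{\infty} n^2 p_n = e^{-|\alpha|^2}\sum_{n=0}^{\infty} n^2\,\frac{|\alpha|^{2n}}{n!} = e^{-|\alpha|^2}\sum_{n=1}^{\infty} n\,\frac{|\alpha|^{2n}}{(n-1)!} = e^{-|\alpha|^2}\sum_{k=0}^{\infty} (k+1)\,\frac{|\alpha|^{2(k+1)}}{k!}$$

$$= e^{-|\alpha|^2}\,|\alpha|^2\sum_{k=0}^{\infty} k\,\frac{|\alpha|^{2k}}{k!} + e^{-|\alpha|^2}\,|\alpha|^2\sum_{k=0}^{\infty} \frac{|\alpha|^{2k}}{k!} = |\alpha|^4 + |\alpha|^2,$$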


where I used the same trick with the sum as above, twice. Now I can find that 
$\Delta n = |\alpha| = \sqrt{\bar{n}_\alpha}$. The relative uncertainty of the photon number, $\Delta n/\bar{n}_\alpha = 1/\sqrt{\bar{n}_\alpha}$, 
becomes progressively smaller as the average number of photons increases. The 
decrease of the quantum fluctuations signifies a transition to classical behavior, and 
one can suppose, therefore, that in the limit $\bar{n}_\alpha \gg 1$, the electric and magnetic fields 
in this state will reproduce behavior typical of a classical electromagnetic wave. 
To verify this assumption, I will compute the expectation values and uncertainties 
of the electric and magnetic fields for this state and will also consider their time 
dependence. 
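
The photon statistics derived above are easy to verify numerically. The following sketch tabulates the distribution of Eq. 7.104 for one illustrative eigenvalue and checks its normalization, mean, and uncertainty. The function name, the chosen value of $\alpha$, and the cutoff of the sum at $n = 80$ are assumptions made for this example; the cutoff is harmless here because the tail of the distribution is negligible for $|\alpha|^2 = 6.25$.

```python
import math

def photon_distribution(alpha, n_max):
    """p_n = exp(-|alpha|^2) |alpha|^(2n) / n!  (Eq. 7.104), for n = 0..n_max."""
    mean = abs(alpha) ** 2
    probs = []
    p = math.exp(-mean)          # p_0
    for n in range(n_max + 1):
        probs.append(p)
        p *= mean / (n + 1)      # recurrence p_{n+1} = p_n * |alpha|^2 / (n + 1)
    return probs

alpha = 1.5 + 2.0j               # illustrative eigenvalue, |alpha|^2 = 6.25
probs = photon_distribution(alpha, 80)

norm = sum(probs)
n_mean = sum(n * p for n, p in enumerate(probs))
n_sq = sum(n * n * p for n, p in enumerate(probs))
delta_n = math.sqrt(n_sq - n_mean ** 2)

print(norm, n_mean, delta_n, delta_n / n_mean)
```

The mean should come out as $|\alpha|^2 = 6.25$ and the uncertainty as $|\alpha| = 2.5$, so the relative uncertainty is $1/\sqrt{\bar{n}_\alpha} = 0.4$ for this rather small average photon number.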

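Inverting Eqs. 7.96 and 7.97, I find for the fields 

$$\hat{B}_0 = \sqrt{\frac{\hbar}{2}}\left(\hat{b} + \hat{b}^\dagger\right) \tag{7.105}$$

$$\hat{\mathcal{E}}_0 = i\sqrt{\frac{\hbar}{2}}\left(\hat{b}^\dagger - \hat{b}\right). \tag{7.106}$$

Taking squares of these expressions yields 

$$\hat{B}_0^2 = \frac{\hbar}{2}\left(\hat{b}^2 + \hat{b}^{\dagger 2} + \hat{b}\hat{b}^\dagger + \hat{b}^\dagger\hat{b}\right) = \frac{\hbar}{2}\left(\hat{b}^2 + \hat{b}^{\dagger 2} + 2\hat{b}^\dagger\hat{b} + 1\right) \tag{7.107}$$

$$\hat{\mathcal{E}}_0^2 = -\frac{\hbar}{2}\left(\hat{b}^2 + \hat{b}^{\dagger 2} - \hat{b}\hat{b}^\dagger - \hat{b}^\dagger\hat{b}\right) = -\frac{\hbar}{2}\left(\hat{b}^2 + \hat{b}^{\dagger 2} - 2\hat{b}^\dagger\hat{b} - 1\right) \tag{7.108}$$

where I changed the order of operators in $\hat{b}\hat{b}^\dagger$ using the commutation relation 
$[\hat{b}, \hat{b}^\dagger] = 1$. Now I am ready to tackle both the expectation values and the 
uncertainties. The computation of the expectation values $\langle\hat{B}_0\rangle$ and $\langle\hat{\mathcal{E}}_0\rangle$ is almost 
trivial: taking into account that $\langle\alpha|\hat{b}|\alpha\rangle = \alpha$ and $\langle\alpha|\hat{b}^\dagger|\alpha\rangle = \langle\alpha|\hat{b}|\alpha\rangle^* = \alpha^*$, I 
have 

$$\langle\hat{B}_0\rangle = \sqrt{\frac{\hbar}{2}}\left(\alpha + \alpha^*\right) \tag{7.109}$$

$$\langle\hat{\mathcal{E}}_0\rangle = i\sqrt{\frac{\hbar}{2}}\left(\alpha^* - \alpha\right). \tag{7.110}$$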


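The expectation values of the squares of the fields take just a bit more work: 
before computing $\langle\alpha|\hat{b}^\dagger\hat{b}|\alpha\rangle$, I first need to realize that the Hermitian conjugate 
of the expression $\hat{b}|\alpha\rangle = \alpha|\alpha\rangle$ is $\langle\alpha|\hat{b}^\dagger = \alpha^*\langle\alpha|$. With this little insight, the rest of the 
computation is as trivial as that for the expectation values. The result is 

$$\langle\hat{B}_0^2\rangle = \frac{\hbar}{2}\left(\alpha + \alpha^*\right)^2 + \frac{\hbar}{2} \tag{7.111}$$

$$\langle\hat{\mathcal{E}}_0^2\rangle = -\frac{\hbar}{2}\left(\alpha - \alpha^*\right)^2 + \frac{\hbar}{2}. \tag{7.112}$$

Finally, the uncertainties of both fields are found to be independent of $\alpha$ and equal to 

$$\Delta B_0 = \Delta\mathcal{E}_0 = \sqrt{\frac{\hbar}{2}},$$

so that their product is indeed the smallest allowed by the uncertainty principle, 
$\Delta B_0\,\Delta\mathcal{E}_0 = \hbar/2$. Relative uncertainties $\Delta B_0/\langle\hat{B}_0\rangle$ diminish with the increase in 
$|\alpha| = \sqrt{\bar{n}_\alpha}$ and vanish in the limit $\bar{n}_\alpha \to \infty$, which obviously corresponds to 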
the classical (no quantum fluctuations) limit. This result provides an additional 
reinforcement to the idea that the electromagnetic field in the coherent states is as 
close to a classical wave as possible. 
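
As a numerical cross-check of Eqs. 7.109 through 7.112, one can build the coherent state from its expansion coefficients, Eq. 7.103, and evaluate the field moments directly in the number basis. The sketch below assumes natural units ($\hbar = 1$), an illustrative eigenvalue $\alpha = 1 + 0.5i$, and a cutoff of the expansion at 60 photons; the function names are hypothetical. It uses $\langle m|\hat{b}|n\rangle = \sqrt{n}\,\delta_{m,n-1}$ to compute $\langle\hat{b}\rangle$, $\langle\hat{b}^2\rangle$, and $\langle\hat{b}^\dagger\hat{b}\rangle$, and then Eqs. 7.105 through 7.108 to assemble the field moments.

```python
import math

HBAR = 1.0  # natural units, an assumption of this sketch

def coherent_coefficients(alpha, n_max):
    """c_n = exp(-|alpha|^2 / 2) alpha^n / sqrt(n!)  (Eq. 7.103), for n = 0..n_max."""
    c = [complex(math.exp(-abs(alpha) ** 2 / 2))]
    for n in range(n_max):
        c.append(c[-1] * alpha / math.sqrt(n + 1))
    return c

def field_statistics(c):
    """Mean values and uncertainties of B0 and E0 in the state with coefficients c."""
    n_max = len(c) - 1
    b1 = sum(c[n].conjugate() * c[n + 1] * math.sqrt(n + 1) for n in range(n_max))   # <b>
    b2 = sum(c[n].conjugate() * c[n + 2] * math.sqrt((n + 1) * (n + 2))
             for n in range(n_max - 1))                                              # <b^2>
    nb = sum(n * abs(c[n]) ** 2 for n in range(n_max + 1))                           # <b_dag b>
    B_mean = math.sqrt(HBAR / 2) * 2 * b1.real          # sqrt(hbar/2) <b + b_dag>
    E_mean = math.sqrt(HBAR / 2) * 2 * b1.imag          # i sqrt(hbar/2) <b_dag - b>
    B_sq = HBAR / 2 * (2 * b2.real + 2 * nb + 1)        # expectation value of Eq. 7.107
    E_sq = -HBAR / 2 * (2 * b2.real - 2 * nb - 1)       # expectation value of Eq. 7.108
    dB = math.sqrt(B_sq - B_mean ** 2)
    dE = math.sqrt(E_sq - E_mean ** 2)
    return B_mean, E_mean, dB, dE

alpha = 1.0 + 0.5j
B_mean, E_mean, dB, dE = field_statistics(coherent_coefficients(alpha, 60))
print(B_mean, E_mean, dB, dE, dB * dE)
```

For this $\alpha$ the mean fields should equal $\sqrt{2}$ and $\sqrt{1/2}$, while both uncertainties equal $\sqrt{\hbar/2}$, so that their product is $\hbar/2$.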

Finally, I will consider how these quantities (the expectation values and 
uncertainties) change with time. The easiest way to do this is to use the Heisenberg 
picture, in which all dynamics is given by the time dependence of the annihilation 
operator, which, as we know from the consideration of the harmonic oscillator, is very 
simple: $\hat{b}_H(t) = \hat{b}\exp(-i\omega t)$, so that $\langle\alpha|\hat{b}_H|\alpha\rangle = \alpha\exp(-i\omega t)$. With this I 
immediately find for the field expectation values 

$$\langle\hat{B}_0(t)\rangle = \sqrt{\frac{\hbar}{2}}\left(\alpha e^{-i\omega t} + \alpha^* e^{i\omega t}\right) \tag{7.113}$$

$$\langle\hat{\mathcal{E}}_0(t)\rangle = i\sqrt{\frac{\hbar}{2}}\left(\alpha^* e^{i\omega t} - \alpha e^{-i\omega t}\right) \tag{7.114}$$

and for their squares 

$$\langle\hat{B}_0^2(t)\rangle = \frac{\hbar}{2}\left(\alpha e^{-i\omega t} + \alpha^* e^{i\omega t}\right)^2 + \frac{\hbar}{2} \tag{7.115}$$

$$\langle\hat{\mathcal{E}}_0^2(t)\rangle = -\frac{\hbar}{2}\left(\alpha e^{-i\omega t} - \alpha^* e^{i\omega t}\right)^2 + \frac{\hbar}{2}. \tag{7.116}$$

It is remarkable that the uncertainties of the fields, 

$$\Delta B_0 = \sqrt{\langle\hat{B}_0^2(t)\rangle - \langle\hat{B}_0(t)\rangle^2}, \qquad \Delta\mathcal{E}_0 = \sqrt{\langle\hat{\mathcal{E}}_0^2(t)\rangle - \langle\hat{\mathcal{E}}_0(t)\rangle^2},$$

remain time independent and satisfy the minimal form of the uncertainty 
principle at all times. While the harmonic time dependence of the expectation value 
is typical for almost any initial state, the uncovered behavior of the uncertainties is 
a special property of the coherent states and is what makes them so special. This 
also guarantees that the shape of the coherent superposition of the stationary states 
does not get distorted with time, similar to what one would expect from a classical 
electromagnetic wave. 
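
The time independence of the uncertainties can be seen in two lines. The sketch below simply subtracts the squares of Eqs. 7.113 and 7.114 from Eqs. 7.115 and 7.116; the shorthand $\alpha_t = \alpha e^{-i\omega t}$ is introduced here only for brevity.

```latex
\begin{aligned}
(\Delta B_0)^2 &= \frac{\hbar}{2}\left(\alpha_t + \alpha_t^*\right)^2 + \frac{\hbar}{2}
  - \left[\sqrt{\frac{\hbar}{2}}\left(\alpha_t + \alpha_t^*\right)\right]^2 = \frac{\hbar}{2}, \\
(\Delta \mathcal{E}_0)^2 &= -\frac{\hbar}{2}\left(\alpha_t - \alpha_t^*\right)^2 + \frac{\hbar}{2}
  - \left[i\sqrt{\frac{\hbar}{2}}\left(\alpha_t^* - \alpha_t\right)\right]^2 = \frac{\hbar}{2},
\qquad \alpha_t = \alpha e^{-i\omega t}.
\end{aligned}
```

All time dependence sits in $\alpha_t$ and cancels identically, so $\Delta B_0\,\Delta\mathcal{E}_0 = \hbar/2$ at every instant.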


7.4 Problems 
Problems for Sect. 7.1 

Problem 83 Using Eq. 7.16 together with Eqs. 7.12 and 7.13, find the time 
dependence of the coordinate $x$ and momentum $p$. Comparing the found result with 
Eq. 7.9, find the relation between the parameters $b_0$, $b_0^*$ and $x_0$, $p_0$. 

Problem 84 Verify that Eqs. 7.14 and 7.15 are equivalent to the Hamiltonian 
equations for the regular coordinate and momentum by computing the time derivatives 
of the variables $b$ and $b^*$ using Eqs. 7.12 and 7.13 together with Eqs. 7.7 and 7.8. 

Problem 85 Prove that $\hat{a}\,|n\rangle = \sqrt{n}\,|n-1\rangle$. 

Problem 86 Suppose that a harmonic oscillator is at $t = 0$ in the state described by 
the following superposition: 

$$|\alpha_0\rangle = a\left(\sqrt{2}\,|0\rangle + \sqrt{3}\,|1\rangle\right).$$


1. Normalize the state. 

2. Find a vector \a(t)) representing the state of the oscillator at an arbitrary time t. 

3. Calculate the uncertainties of the coordinate and momentum operators in this 
state, and check that the uncertainty relation is fulfilled at all times. 

Problem 87 Using the method of mathematical induction, prove that 



(—l)"exp 



d n exp (—y 2 ) 
dy n 


and derive Eq. 7.44 for the coordinate representation of an eigenvector of the 
Hamiltonian of the harmonic oscillator. 

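Problem 88 Using the matrices $a_{mn} = \langle m|\hat{a}|n\rangle$ and $(a^\dagger)_{mn} = \langle m|\hat{a}^\dagger|n\rangle$, demonstrate by 
direct matrix multiplication that 

$$\left(a^\dagger a\right)_{mn} = \sum_k (a^\dagger)_{mk}\,a_{kn} = m\,\delta_{mn}.$$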


Problem 89 Using the coordinate representation of the lowering operator a , apply 
it to the coordinate representation of the n = 3 stationary state of the harmonic 
oscillator. Is the result normalized? If not, normalize it and compare the found 
normalization factor with Eq. 7.40. 

Problem 90 Using lowering and raising operators, compute the expectation values 
of the kinetic, $K$, and potential, $V$, energies of a harmonic oscillator in an arbitrary 
stationary state $|n\rangle$. Check that 

$$\langle K\rangle = \langle V\rangle.$$

This result is a particular case of the so-called virial theorem relating the expectation 
values of kinetic and potential energies of a particle in the potential described by 
$V = kx^p$. The general form of the theorem is $2\langle K\rangle = p\,\langle V\rangle$, which for $p = 2$ 
(harmonic oscillator) is reduced to the result of this problem. 

Problem 91 Derive explicit expressions for Hermite polynomials with $n = 3, 4, 5$ 
(of course, you can always google it, but do it by yourselves; you can learn 
something), and demonstrate explicitly that they obey the orthogonality relation: 

$$\int_{-\infty}^{\infty} \exp\left(-x^2\right) H_m(x) H_n(x)\,dx = 0, \quad m \neq n.$$

Problem 92 

1. Find eigenvectors $|\alpha\rangle$ of the lowering operator $\hat{a}$: $\hat{a}\,|\alpha\rangle = \alpha\,|\alpha\rangle$ in the coordinate 
representation. Normalize them. 

2. Show that the raising operator $\hat{a}^\dagger$ does not have normalizable eigenvectors. 

Problem 93 Compute a probability that a measurement of the coordinate will yield 
the value in the classically forbidden region for the oscillator prepared in each of the 
following stationary states: |0), |1), and |3). (Note that the boundary of classically 
allowed regions is different for each of these states.) 

Problem 94 Consider an electron with mass $m_e$ and charge $-e$ in a harmonic 
potential $V = m_e\omega^2 x^2/2$ also subjected to a uniform electric field $\mathcal{E}$ in the positive 
$x$ direction. 

1. Write down the Hamiltonian for this system. 

2. Using operator identities from Sect. 3.2.2, prove that 

$$\exp\left(\frac{i\hat{p}_x d}{\hbar}\right)\hat{x}\,\exp\left(-\frac{i\hat{p}_x d}{\hbar}\right) = \hat{x} + d, \tag{7.117}$$

where $\hat{x}$ and $\hat{p}_x$ are the regular operators of the coordinate and the respective 
component of the momentum and $d$ is a real number. 

3. In Sect. 5.1.2 I already demonstrated, using the example of a parity operator, that 
if two vectors are related to each other as $|\beta\rangle = \hat{T}|\alpha\rangle$, while vectors $|\gamma\rangle$ and 
$|\delta\rangle$ are defined as $|\gamma\rangle = \hat{U}|\beta\rangle$, $|\delta\rangle = \hat{U}|\alpha\rangle$, one can show that $|\gamma\rangle = \hat{T}'|\delta\rangle$, 
where $\hat{T}' = \hat{U}\hat{T}\hat{U}^{-1}$. Use this relation together with Eq. 7.117 to reduce the 
Hamiltonian found in Part I of this problem to that of a harmonic oscillator 
without the electric field, and express the eigenvectors of the Hamiltonian with 
the field (perturbed Hamiltonian) in terms of the eigenvectors of the Hamiltonian 
without the field (unperturbed). 

4. Write down the coordinate wave function representing the states of the perturbed 
Hamiltonian in terms of the wave functions representing the states of the 
unperturbed Hamiltonian. Comment on the results. Explain how it can be derived 
by manipulating the classical Hamiltonian before its quantization. 

5. If the electron is in its ground state before the electric field is turned on, 
find the probability that the electron will be found in the ground state of the 
Hamiltonian with the electric field on. (Hint: You will need to use the operator 
identities concerning the exponential function of a sum of operators 
discussed in Sect. 3.2.2 and the representation of the momentum operator in 
terms of raising and lowering operators. Remember: The exponential function 
of an operator is defined as the corresponding power series.) 

Problems for Sect. 7.1.2 


Problem 95 Using the Heisenberg representation, find the uncertainty of the 
coordinate and momentum operators at an arbitrary time $t$ for the state 

$$|\alpha\rangle = \frac{1}{\sqrt{3}}\left(|1\rangle + |2\rangle + |3\rangle\right),$$

where $|n\rangle$ is the $n$th stationary state of the harmonic oscillator. Verify that the uncertainty 
relation is fulfilled at all times. 

Problem 96 Solve the previous problem using the Schrödinger representation. 

Problem 97 Consider the system described in Problem 94, but work now in the 
Heisenberg picture. 

1. Write down the Hamiltonian of the electron in the Heisenberg picture. 

2. Write down the Heisenberg equations for lowering and raising operators and 
solve them. 

3. Now, assume that the electric field was turned on at $t = 0$, when the electron 
was in the ground state $|0\rangle$ of the unperturbed Hamiltonian, and turned back off 
at $t = t_f$. In the Heisenberg picture, the state of the system does not change, so 
that all time evolution is described by the operators. Let us call the lowering and 
raising operators at $t = 0$ $\hat{a}_{in}$, $\hat{a}_{in}^\dagger$ (these are, obviously, the same operators that 
appear as initial conditions in the solutions of the Heisenberg equations found in 
Part I of the problem). These operators are just the lowering and raising operators in 
the Schrödinger picture, so that the initial state obeys the equation $\hat{a}_{in}|0\rangle = 0$. In the 
Heisenberg picture, raising and lowering operators change with time according 
to the expressions found in Part I. Considering these expressions at $t = t_f$, you 
will find $\hat{a}_f = \hat{a}(t_f)$ and $\hat{a}_f^\dagger = \hat{a}^\dagger(t_f)$. Verify that these operators have the same 
commutation relation as their Schrödinger counterparts. 

4. The time evolution of the Hamiltonian, which at all times has the form found in 
Part I, is completely described by the time dependence of the lowering and raising 
operators. Using the expressions for $\hat{a}_f$ and $\hat{a}_f^\dagger$ found in the previous part of the 
problem, write down the Hamiltonian of the electron at times $t > t_f$ in terms of the 
operators $\hat{a}_{in}$, $\hat{a}_{in}^\dagger$. 

5. Using the found expression for the Hamiltonian, find the expectation value of the 
energy in the given initial state. 

6. The Hamiltonian of the electron at $t > t_f$ has the same form in terms of the operators 
$\hat{a}_f$, $\hat{a}_f^\dagger$ as the Hamiltonian for $t < 0$ has in terms of the operators $\hat{a}_{in}$, $\hat{a}_{in}^\dagger$. Also, it 
has been shown in Part III that $\hat{a}_f$, $\hat{a}_f^\dagger$ have the same commutation relations as 
$\hat{a}_{in}$, $\hat{a}_{in}^\dagger$. This means that the Hamiltonian at $t > t_f$ has the same eigenvalues, and its 
eigenvectors satisfy the same relations: 

$$\hat{a}_f\,|0\rangle_f = 0$$

$$|n\rangle_f = \frac{1}{\sqrt{n!}}\left(\hat{a}_f^\dagger\right)^n |0\rangle_f,$$

where the first equation defines the new vacuum state $|0\rangle_f$ and the second 
equation defines the new eigenvectors. Since the operators $\hat{a}_f$, $\hat{a}_f^\dagger$ differ from $\hat{a}_{in}$, $\hat{a}_{in}^\dagger$, 
the new ground state and the new eigenvectors will be different from those 
of the initial Hamiltonian. Using the representation of $\hat{a}_f$ in terms of $\hat{a}_{in}$, find 
the probability that if the system started out in the ground state of the initial 
Hamiltonian, it will be found in the new ground state $|0\rangle_f$. 


Problems for Sect. 7.2 

Problem 98 Verify Eqs. 7.71 and 7.72. 

Problem 99 Rewrite the Schrödinger equation for the stationary states of a 3-D 
isotropic harmonic oscillator in cylindrical coordinates $\rho, \varphi, z$. Show that the wave 
function can be written down as $\psi_{n_1,n_2,m} = Z_{n_1}(z) R_{n_2}(\rho) \exp(im\varphi)$, and derive 
equations for the functions $Z_{n_1}(z)$ and $R_{n_2}(\rho)$. The first of these equations will coincide 
with the Schrödinger equation for a one-dimensional harmonic oscillator, so you 
can use the results of Sect. 7.1.1 to determine this function and the corresponding 
contribution to energy, but the equation for $R_{n_2}(\rho)$ will have to be solved from 
scratch. Do it using the power series method developed in the text for the spherical 
coordinates. 

Problem 100 You just saw that the wave functions of an isotropic oscillator
can be presented using Cartesian, spherical, and cylindrical coordinates. While
these functions, corresponding to the same degenerate energy value, have
very different forms, they all represent eigenvectors belonging
to the same eigenvalue, so you should be able to present each of them as
a linear combination of the others belonging to that eigenvalue. Verify that
this is indeed the case for states belonging to the energy value $E = 5\hbar\omega/2$ by
explicitly expressing the wave functions written in Cartesian coordinates in terms of
their spherical and cylindrical coordinate counterparts.


Problems for Sect. 7.3.2 

Problem 101 Verify Eqs. 7.115 and 7.116 for time-dependent expectation values
of the squares of electric and magnetic fields.

Problem 102 The flow of the energy of the electromagnetic field is described by
the Poynting vector, which in SI units is given by

$$\mathbf S = \frac{1}{\mu_0}\mathbf E\times\mathbf B.$$

In our toy model of the electromagnetic field, the Poynting vector becomes simply

$$S = \frac{1}{\mu_0}E_xB_y.$$

In quantum theory, the Poynting vector becomes an operator. Find the time-dependent
expectation value and uncertainty of this operator in the coherent state.



Chapter 8 

Hydrogen Atom 






8.1 Transition to a One-Body Problem 


The quantum mechanics of the hydrogen atom, understood as a system consisting of
a positively charged nucleus and a single negatively charged electron, is remarkable
in many respects. It is one of the very few exactly solvable three-dimensional
models with a realistic interaction potential. As such, it provides the foundation for
much of our qualitative as well as quantitative understanding of the optical properties
of atoms, at least as a first approximation for more complicated situations. A similar
model also arises in the physics of semiconductors, where bound states of negative
and positive charges form entities known as excitons, as well as in situations
involving a single conduction electron interacting with a charged impurity. Another
curious property of this model is that the energy eigenvalues emerging from the
exact solution of the Schrödinger equation coincide with the energy levels predicted by
the heuristic Bohr model, which is based on a rather arbitrary combination of Newton’s laws
with a simple quantization rule for the angular momentum. While this might seem like a
pure coincidence of limited significance, given that by now we have harnessed the
full power of quantum theory and do not really need Bohr’s quantization rules, one
still might wonder by how much the development of quantum physics would have
been delayed if it were not for this “coincidence.”

I will begin the exploration of this model with a brief reminder of how classical
mechanics deals with the problem. There are two different aspects to it which need
to be addressed. First, unlike all the models considered so far, which involved
a single particle, this is a two-body problem. Luckily for us, this problem only
pretends to be two-body and can easily be reduced to two single-particle problems.
This is how it is done in classical physics. The classical Hamiltonian of the problem
has the following form:

$$H = \frac{p_1^2}{2m_p} + \frac{p_2^2}{2m_e} + V(|\mathbf r_1 - \mathbf r_2|), \tag{8.1}$$

where $\mathbf p_1, \mathbf r_1$ and $\mathbf p_2, \mathbf r_2$ are the momentums and positions of the particles with the
corresponding masses $m_p$ and $m_e$, and $V(|\mathbf r_1 - \mathbf r_2|)$ is the Coulomb potential energy, which
in SI units can be written as

$$V(|\mathbf r_1 - \mathbf r_2|) = -\frac{1}{4\pi\varepsilon_r\varepsilon_0}\frac{Ze^2}{|\mathbf r_1 - \mathbf r_2|}. \tag{8.2}$$


Here $e$ is the elementary charge, $Z$ is the atomic number of the nucleus, introduced
to allow dealing with heavier hydrogen-like atoms such as atoms of alkali metals
(or a charged impurity), and $\varepsilon_r$ is the relative dielectric permittivity, accounting for the
possibility that the interacting particles are inside a dielectric medium. To separate
this problem into two single-particle problems, I introduce new coordinates:

$$\mathbf R = \frac{m_p\mathbf r_1 + m_e\mathbf r_2}{m_p + m_e}, \tag{8.3}$$

$$\mathbf r = \mathbf r_1 - \mathbf r_2. \tag{8.4}$$


I hope you have recognized in $\mathbf R$ the coordinate of the center of mass of the two particles
and in $\mathbf r$ their relative position vector. Now I need to find the new momentums
associated with these coordinates. For your sake I will avoid using the formalism of
canonical transformations in Hamiltonian mechanics and will begin by defining
the kinetic energy in terms of the respective velocities. Inverting Eqs. 8.3 and 8.4, I get

$$\mathbf r_1 = \mathbf R + \frac{m_e}{m_p + m_e}\mathbf r,$$

$$\mathbf r_2 = \mathbf R - \frac{m_p}{m_p + m_e}\mathbf r,$$


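so that the kinetic energy can be found as

$$K = \frac{m_p}{2}\left(\frac{d\mathbf r_1}{dt}\right)^2 + \frac{m_e}{2}\left(\frac{d\mathbf r_2}{dt}\right)^2 = \frac{m_p + m_e}{2}\left(\frac{d\mathbf R}{dt}\right)^2 + \frac{1}{2}\frac{m_pm_e}{m_p + m_e}\left(\frac{d\mathbf r}{dt}\right)^2.$$

Introducing two new masses, the total mass of the system $M = m_p + m_e$ and the reduced
mass $\mu = m_pm_e/(m_p + m_e)$, I can define the momentum of the center of mass

$$\mathbf p_R = M\frac{d\mathbf R}{dt}$$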










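and relative momentum

$$\mathbf p_r = \mu\frac{d\mathbf r}{dt},$$

so that the Hamiltonian, Eq. 8.1, can be rewritten as

$$H = \frac{p_R^2}{2M} + \frac{p_r^2}{2\mu} + V(r).$$

The corresponding Hamiltonian equations separate into a pair of equations for
the position and momentum of the center of mass,

$$\frac{d\mathbf R}{dt} = \frac{\mathbf p_R}{M}, \qquad \frac{d\mathbf p_R}{dt} = 0,$$

and a pair for the relative motion,

$$\frac{d\mathbf r}{dt} = \frac{\mathbf p_r}{\mu}, \qquad \frac{d\mathbf p_r}{dt} = -\frac{dV}{d\mathbf r}.$$

The first pair of these equations describes the uniform motion of a free particle (the
center of mass of the system), while the second pair describes the motion of a single
particle in the potential $V(r)$.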

I have little doubt that the variables $\mathbf r$ and $\mathbf p_r$ form a canonically conjugate pair, and
so I can transition to the quantum description by promoting them to operators with the
standard commutation relation $[\hat r_i, \hat p_{rj}] = i\hbar\delta_{ij}$. However, in order to be 100% sure
and convince all possible skeptics, I do need to verify this fact by computing
Poisson brackets of these variables. To this end I need to express $\mathbf r$ and $\mathbf p_r$ in terms
of the initial coordinates and momentums. The expression for $\mathbf r$ is given by Eq. 8.4, so I only
need to figure out $\mathbf p_r$:

$$\mathbf p_r = \frac{m_em_p}{m_e + m_p}\left(\frac{d\mathbf r_1}{dt} - \frac{d\mathbf r_2}{dt}\right) = \frac{m_e}{m_e + m_p}\mathbf p_1 - \frac{m_p}{m_e + m_p}\mathbf p_2. \tag{8.5}$$

Let me focus for concreteness on the $x$-components of the momentum and coordinate.
Equation 3.4 for the Poisson bracket, where the summation must include the sum over
the coordinates of both particles, yields

$$\{x, p_{rx}\} = \frac{\partial x}{\partial x_1}\frac{\partial p_{rx}}{\partial p_{1x}} + \frac{\partial x}{\partial x_2}\frac{\partial p_{rx}}{\partial p_{2x}} = \frac{m_e}{m_e + m_p} + \frac{m_p}{m_e + m_p} = 1,$$

as expected. All other Poisson brackets also predictably produce the necessary results,
so you can start breathing again.
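
For readers who would rather let a computer do the bookkeeping, the remaining brackets can be checked mechanically. The following is only a minimal sketch in Python (my choice of language, not something used elsewhere in the text). It assumes, as in Eq. 8.3, that particle 1 carries the mass $m_p$ and particle 2 the mass $m_e$, and it uses the fact that all four new variables are linear in the old ones, so that every partial derivative is a constant coefficient. The helper names `poisson_bracket` and `transformed_variables` and the integer masses are illustrative assumptions.

```python
from fractions import Fraction


def poisson_bracket(f, g):
    """Poisson bracket {f, g} for functions that are linear in x1, x2, p1, p2.

    Each function is a dict of constant coefficients, e.g. x1 - x2 is
    {"x1": 1, "x2": -1}, so every partial derivative is just a coefficient.
    """
    total = Fraction(0)
    for k in ("1", "2"):  # sum over both particles
        total += f.get("x" + k, 0) * g.get("p" + k, 0)
        total -= f.get("p" + k, 0) * g.get("x" + k, 0)
    return total


def transformed_variables(m_p, m_e):
    """x-components of the new coordinates and momenta as linear forms."""
    M = m_p + m_e
    x_rel = {"x1": Fraction(1), "x2": Fraction(-1)}            # Eq. 8.4
    x_cm = {"x1": Fraction(m_p, M), "x2": Fraction(m_e, M)}    # Eq. 8.3
    p_rel = {"p1": Fraction(m_e, M), "p2": Fraction(-m_p, M)}  # Eq. 8.5
    p_cm = {"p1": Fraction(1), "p2": Fraction(1)}              # M dR/dt = p1 + p2
    return x_rel, x_cm, p_rel, p_cm


if __name__ == "__main__":
    x_rel, x_cm, p_rel, p_cm = transformed_variables(m_p=1836, m_e=1)
    for name, (f, g) in {
        "{x, p_rx}": (x_rel, p_rel),
        "{X, P_Rx}": (x_cm, p_cm),
        "{x, P_Rx}": (x_rel, p_cm),
        "{X, p_rx}": (x_cm, p_rel),
    }.items():
        print(name, "=", poisson_bracket(f, g))
```

The two "diagonal" brackets come out equal to one and the two "mixed" ones vanish for any choice of masses, which is exactly what is needed for the pairs $(\mathbf R, \mathbf p_R)$ and $(\mathbf r, \mathbf p_r)$ to be canonical.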
8.2 Eigenvalues and Eigenvectors

It is important that the portion of the Hamiltonian describing the motion of the center
of mass, $\hat H_R = \hat p_R^2/2M$, is completely independent of the part responsible for the
relative motion,

$$\hat H_r = \frac{\hat p_r^2}{2\mu} + V(r), \tag{8.6}$$

so that the eigenvectors of the total Hamiltonian $\hat H = \hat H_R + \hat H_r$ can be written
down as $|\chi_R\rangle|\chi_r\rangle$, where the first vector is an eigenvector of $\hat H_R$ with eigenvalue $E_R$,
while the second vector is an eigenvector of $\hat H_r$ with its own eigenvalue $E_r$. The
eigenvalue of the total Hamiltonian is easily verified to be $E_R + E_r$ (when verifying
this statement, remember that $\hat H_R$ acts only on $|\chi_R\rangle$, while $\hat H_r$ only affects $|\chi_r\rangle$). I
am going to ignore the center-of-mass motion and will focus on the Hamiltonian
$\hat H_r$, Eq. 8.6, with the Coulomb potential energy, Eq. 8.2. In what follows I will omit
the subindex $r$ in the Hamiltonian.

What I am dealing with here is yet another example of a particle moving in a
central potential, similar to the isotropic harmonic oscillator problem considered in
Sect. 7.2. Just like in the case of the harmonic oscillator, the Hamiltonian, Eq. 8.6, commutes
with the angular momentum operators $\hat L^2$ and $\hat L_z$; thus its eigenvectors are also
eigenvectors of the angular momentum. Working in the position representation and
using spherical coordinates to represent the position, I can again write down for the
wave function

$$\psi_{n,l,m}(r,\theta,\varphi) = Y_l^m(\theta,\varphi)R_{n,l}(r),$$

where $Y_l^m(\theta,\varphi)$ are spherical harmonics, the coordinate representation of the
eigenvectors of the angular momentum operators. The equation for the remaining radial
function $R_{n,l}(r)$ is derived in exactly the same way as in Sect. 7.2 and takes a
form similar to Eq. 7.71:

$$-\frac{\hbar^2}{2\mu r^2}\frac{d}{dr}\left(r^2\frac{dR_{n_r,l}}{dr}\right) + \frac{\hbar^2l(l+1)}{2\mu r^2}R_{n_r,l} - \frac{1}{4\pi\varepsilon_r\varepsilon_0}\frac{Ze^2}{r}R_{n_r,l} = E_{l,n_r}R_{n_r,l} \tag{8.7}$$


with the obvious replacements of $m_e \to \mu$ and of the quadratic harmonic oscillator potential
by the Coulomb potential. The energy eigenvalues $E_{l,n_r}$ are found by looking for
normalizable solutions to this equation. My choice of indexes to label the
eigenvalues reflects the fact that the eigenvalues of a Hamiltonian with any central
potential do not depend on $m$. Indeed, the quantum number $m$ is defined with respect
to a particular choice of the polar axis $Z$, but since the energy of a system with a
central potential cannot depend upon an arbitrary choice of axis, it should not depend
on this quantum number. Here is another example of how symmetry considerations
help to analyze a problem.
I will begin by reducing Eq. 8.7 to a dimensionless form, as is customary in
this type of situation. What I need for this is a characteristic length scale, which in
this problem, unlike the harmonic oscillator case, is not that obvious. But there is a
trick which I can use to find it, and I am going to share it with you. Just by looking
at the radial equation, I know that there are three main parameters in this problem: mass $\mu$, charge
$\tilde e$, and Planck’s constant $\hbar$, and I need to find their combination
with the dimension of length. This is done by first writing down this combination
in the most generic form as $\mu^\alpha\tilde e^\beta\hbar^\gamma$, where $\tilde e = e/\sqrt{4\pi\varepsilon_0\varepsilon_r}$ is the combination of the
charge with the vacuum and relative permittivities, $\varepsilon_0$ and $\varepsilon_r$ respectively, appearing in the
Coulomb law in SI units, while $\alpha$, $\beta$, and $\gamma$ are unknown powers to be determined.
In the next step, I will present the dimension of each factor in this expression in
terms of the basic quantities: length, time, and mass. For instance, the dimension
of $\tilde e$ can be found from the Coulomb law as $[\tilde e] = [F]^{1/2}[L]$, where $[F]$ stands for
the dimension of force and $[L]$ stands for the dimension of length. The dimension of
force in basic quantities is $[F] = [M][L][T]^{-2}$, where $[M]$ represents the dimension
of mass and $[T]$ represents the dimension of time (think of Newton’s second law).
So, for the effective charge, I have $[\tilde e] = [M]^{1/2}[L]^{3/2}[T]^{-1}$. The dimension of
Planck’s constant can be determined from the Einstein-de Broglie relation between
energy and frequency as $[\hbar] = [E][T] = [M][L]^2[T]^{-1}$, where in the second step I
expressed the dimension of energy $[E]$ as $[E] = [F][L]$. Combining the results for
the mass, the charge, and Planck’s constant, I find

$$[\mu^\alpha\tilde e^\beta\hbar^\gamma] = [M]^\alpha[M]^{\beta/2}[L]^{3\beta/2}[T]^{-\beta}[M]^\gamma[L]^{2\gamma}[T]^{-\gamma} = [M]^{\alpha+\beta/2+\gamma}[L]^{3\beta/2+2\gamma}[T]^{-\beta-\gamma}.$$

If I want this expression to have the dimension of length $[L]$, I need to eliminate the
excess dimensions $[M]$ and $[T]$. Remembering that any quantity raised to
the power of zero turns to unity and becomes dimensionless, I can eliminate $[M]$ and
$[T]$ by requiring that their corresponding powers vanish:

$$\alpha + \beta/2 + \gamma = 0,$$
$$\beta + \gamma = 0.$$

Then all that is left to do is to make the power of $[L]$ equal to unity:

$$3\beta/2 + 2\gamma = 1.$$

The result is a system of equations for the unknown powers, solving which I find
$\gamma = 2$, $\beta = -2$, and $\alpha = -1$; i.e., the characteristic length scale can be constructed
from the parameters at our disposal as

$$a_B = \frac{4\pi\varepsilon_0\varepsilon_r\hbar^2}{\mu e^2}. \tag{8.8}$$
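
The three conditions on the powers form a small linear system, and the resulting length is easy to evaluate. The snippet below is a hedged sketch rather than anything taken from the text: it solves the system exactly with rational arithmetic and then evaluates the length scale of Eq. 8.8 for an electron in vacuum. The function names are my own, and the numerical constants are standard tabulated SI values that I have typed in by hand (treat them as assumptions and check them against a current table if precision matters).

```python
from fractions import Fraction
import math


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with exact fractions."""
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


def solve_exponents():
    """Powers (alpha, beta, gamma) of mass, effective charge and hbar giving a length."""
    half = Fraction(1, 2)
    matrix = [
        [1, half, 1],      # total power of [M] must vanish
        [0, 1, 1],         # total power of [T] must vanish (overall sign dropped)
        [0, 3 * half, 2],  # total power of [L] must equal one
    ]
    return solve_linear(matrix, [0, 0, 1])


def bohr_radius(mass=9.1093837015e-31, eps_r=1.0):
    """Length scale of Eq. 8.8 in meters; defaults: electron mass, vacuum."""
    hbar = 1.054571817e-34   # J s
    e = 1.602176634e-19      # C
    eps0 = 8.8541878128e-12  # F/m
    return 4 * math.pi * eps0 * eps_r * hbar**2 / (mass * e**2)


if __name__ == "__main__":
    print(solve_exponents())  # alpha, beta, gamma
    print(bohr_radius())      # in meters
```

The same `solve_linear` helper can be reused for any other "find the combination with the right dimension" exercise by changing the matrix of powers.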
The characteristic length found in this way is actually well known from Bohr’s theory of atomic
spectra and is called the Bohr radius. It can be used to introduce a dimensionless
coordinate $\varsigma = r/a_B$ and rewrite Eq. 8.7 as

$$-\frac{1}{\varsigma^2}\frac{d}{d\varsigma}\left(\varsigma^2\frac{dR_{n_r,l}}{d\varsigma}\right) + \frac{l(l+1)}{\varsigma^2}R_{n_r,l} - \frac{2Z}{\varsigma}R_{n_r,l} = \frac{2(4\pi\varepsilon_0\varepsilon_r)^2\hbar^2}{e^4\mu}E_{l,n_r}R_{n_r,l}. \tag{8.9}$$

You can verify (do it yourselves) that the quantity

$$E = \frac{e^4\mu}{32\pi^2\varepsilon_0^2\varepsilon_r^2\hbar^2} \tag{8.10}$$

has the dimension of energy, so that I can present the right-hand side of this equation
in terms of the dimensionless energy parameter

$$\epsilon_{l,n_r} = E_{l,n_r}/E.$$

Finally, introducing the auxiliary radial function $u_{n_r,l} = \varsigma R_{n_r,l}$ (the same as in the
harmonic oscillator problem), I obtain an effective one-dimensional Schrödinger
equation similar to Eq. 7.73:

$$-\frac{d^2u_{n_r,l}}{d\varsigma^2} + \frac{l(l+1)}{\varsigma^2}u_{n_r,l} - \frac{2Z}{\varsigma}u_{n_r,l} = \epsilon_{l,n_r}u_{n_r,l}. \tag{8.11}$$


The effective potential in Eq. 8.11 is positively infinite at small $\varsigma$, but as $\varsigma$ increases,
it, unlike the harmonic oscillator potential, becomes negative, reaches a minimum
value of $-Z^2/[l(l+1)]$ at $\varsigma = l(l+1)/Z$, and remains negative while approaching
zero for $\varsigma \to \infty$; see Fig. 8.1. Classical motion in such a potential is bound for
negative values of energy and unbound for positive energies. In the former case,
we are dealing with a particle moving along a closed elliptical orbit, while in the
latter case, the situation is better described in terms of scattering of a particle by
the potential. In the quantum description, as usual, we should expect states characterized
by a discrete spectrum of eigenvalues for classically bound motion (negative
energies) and states with a continuous spectrum for positive energies. The wave
functions representing states of the continuous spectrum are well known but rather
complex mathematically and are not used too frequently, so I shall avoid dealing
with them for the sake of keeping everyone sane. The range of negative energies is
much more important for understanding the physical processes in atoms and is more
tractable. In terms of atomic physics, the states with negative energies correspond
to intact atoms, where the electron is bound to its nucleus, and the probability that it
will turn up infinitely far from the nucleus is zero. The states of the continuous spectrum
correspond to ionized atoms, where the energy of the electron is too large for the nucleus
to be able to “catch” it, so that the electron can be found at arbitrarily large distances
from the nucleus.

Fig. 8.1 Dimensionless effective potential as a function of dimensionless radial coordinate
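
The position and depth of the minimum of the effective potential quoted above are easy to confirm numerically. Here is a small sketch under my own naming (none of these functions come from the text): it compares the closed-form expressions with a brute-force search on a uniform grid of the dimensionless radial coordinate. It applies to $l \geq 1$ only, since for $l = 0$ the centrifugal term is absent and the potential has no minimum.

```python
def effective_potential(s, l, z=1):
    """Dimensionless effective potential of Eq. 8.11: l(l+1)/s**2 - 2Z/s."""
    return l * (l + 1) / s**2 - 2 * z / s


def analytic_minimum(l, z=1):
    """Position and depth of the minimum quoted in the text (valid for l >= 1)."""
    return l * (l + 1) / z, -z**2 / (l * (l + 1))


def grid_minimum(l, z=1, s_max=50.0, points=200000):
    """Brute-force search for the minimum on a uniform grid in (0, s_max]."""
    best_s, best_v = None, float("inf")
    for i in range(1, points + 1):
        s = s_max * i / points
        v = effective_potential(s, l, z)
        if v < best_v:
            best_s, best_v = s, v
    return best_s, best_v


if __name__ == "__main__":
    for l in (1, 2, 3):
        print(l, analytic_minimum(l), grid_minimum(l))
```

A classically bound orbit with a given $l$ can only exist for energies between this minimum and zero, which is why the depth of the well shrinks so quickly with growing angular momentum.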

The process of finding the solution of Eq. 8.11 follows the same steps as solving
the similar equation for the harmonic oscillator: find the asymptotic behavior at small
and large $\varsigma$, factor it out, and present the residual function as a power series. I,
however, can simplify the form of the equation a bit more by replacing the variable $\varsigma$
with a new variable $\rho = \varsigma\sqrt{-\epsilon_{l,n_r}}$ (remember that $\epsilon_{l,n_r} < 0$!). Equation 8.11 now takes
the following form:

$$\frac{d^2u_{n_r,l}}{d\rho^2} - \frac{l(l+1)}{\rho^2}u_{n_r,l} + \frac{\kappa_{l,n_r}}{\rho}u_{n_r,l} - u_{n_r,l} = 0,$$

where I introduced a new parameter

$$\kappa_{l,n_r} = \frac{2Z}{\sqrt{-\epsilon_{l,n_r}}}. \tag{8.12}$$


The asymptotic behavior at small $\rho$ is determined by the contribution from the angular
momentum and is the same as for the harmonic oscillator, $u_{n_r,l} \propto \rho^{l+1}$, but the
large-$\rho$ limit is now determined by the $u_{n_r,l}$ term. The resulting equation

$$\frac{d^2u_{n_r,l}}{d\rho^2} = u_{n_r,l}$$

has two obvious solutions,

$$u_{n_r,l} \propto \exp(\pm\rho),$$

of which I will only keep the exponentially decreasing one in the hope of ending up with a
normalizable solution. Thus, I am looking for the solution in the form

$$u_{n_r,l} = \rho^{l+1}\exp(-\rho)\,v_{n_r,l}(\rho), \tag{8.13}$$

where a differential equation for the reduced function $v_{n_r,l}$ is derived by substituting
Eq. 8.13 into Eq. 8.11 written in terms of $\rho$. The rest of the procedure is quite similar to the one I outlined
in the harmonic oscillator problem: present $v_{n_r,l}$ as a power series with respect
to $\rho$, derive the recursion relation for the coefficients of the expansion, verify that
the asymptotic behavior of the resulting power series yields a non-normalizable wave
function, restore normalizability by requiring that the power series terminates after
a finite number of terms, and obtain an equation for the energy values consistent
with the normalizability requirement. Leaving the details of this analysis to the readers
as an exercise (of course, you can always cheat by looking it up in a number of
other textbooks, but you will gain so much more in terms of your technical prowess
and self-respect by doing it yourselves!), I will present the result. The only way to
ensure normalizability of the resulting wave functions is to require that the parameter
$\kappa_{l,n_r}$ satisfies the following condition:

$$\kappa_{l,n_r} = 2(j_{max} + l + 1), \tag{8.14}$$


where $j_{max}$ is the number of the largest non-zero coefficient in the power series
expansion of the function

$$v_{n_r,l}(\rho) = \sum_j c_j\rho^j$$

and takes arbitrary integer values starting from 0. However, since $j_{max}$ appears in
Eq. 8.14 only in combination with $l$, the actual allowed values of $\kappa_{l,n_r}$ depend on
a single parameter, called the principal quantum number $n$, which takes any integer
value starting from $n = 1$. The independence of the energy eigenvalues of the hydrogen
Hamiltonian of the angular momentum number $l$ is a peculiarity of the Coulomb
potential and reflects an additional symmetry present in this problem. In classical
mechanics this symmetry manifests itself via the existence of an additional
(to energy and angular momentum) conserved quantity called the Laplace-Runge-Lenz
vector,

$$\mathbf A = \mathbf p\times\mathbf L - \mu\frac{Ze^2}{4\pi\varepsilon_r\varepsilon_0}\mathbf e_r,$$

where $\mathbf e_r$ is a unit vector in the radial direction. In quantum theory this vector can be
promoted to a Hermitian operator, but this procedure is not trivial because the operators
$\hat{\mathbf p}$ and $\hat{\mathbf L}$ do not commute. However, I am afraid that if I continue talking about the
quantum version of the Laplace-Runge-Lenz vector, I might open myself to a lawsuit
for inflicting cruel and unusual punishment on the readers, so I will restrain myself.
Those who are not afraid may look it up, but the quantum treatment of the Laplace-Runge-Lenz
vector is not very common even in the wild prairies of the Internet.
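
Since the power-series analysis for the reduced function of Eq. 8.13 is left as an exercise, a quick way to check your own derivation may be welcome. The sketch below rests on a recursion relation that I derived myself by substituting Eq. 8.13 into the radial equation written in terms of the rescaled variable; it is not quoted from the text, so treat it as an assumption to be verified by hand: $c_{j+1} = c_j\,[2(j + l + 1) - \kappa]/[(j + 1)(j + 2l + 2)]$. The code shows that the series terminates exactly when $\kappa$ takes the values allowed by Eq. 8.14 and compares the resulting polynomial with an associated Laguerre polynomial of doubled argument, written in the convention in which $L_k^{\alpha}(0)$ equals the binomial coefficient $\binom{k+\alpha}{k}$. All function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial


def series_coefficients(kappa, l, count):
    """First `count` coefficients c_j of v(rho) = sum_j c_j rho**j, with c_0 = 1.

    Assumed recursion (derived from u = rho**(l+1) * exp(-rho) * v):
        c_{j+1} = (2 (j + l + 1) - kappa) / ((j + 1) (j + 2 l + 2)) * c_j
    """
    coeffs = [Fraction(1)]
    for j in range(count - 1):
        ratio = Fraction(2 * (j + l + 1) - kappa, (j + 1) * (j + 2 * l + 2))
        coeffs.append(ratio * coeffs[-1])
    return coeffs


def rescaled_laguerre(n, l):
    """Coefficients in rho of L_{n-l-1}^{2l+1}(2 rho), divided by the constant term."""
    k, alpha = n - l - 1, 2 * l + 1
    raw = [
        Fraction((-1) ** i * comb(k + alpha, k - i) * 2**i, factorial(i))
        for i in range(k + 1)
    ]
    return [c / raw[0] for c in raw]


if __name__ == "__main__":
    # kappa = 2n with n = 3, l = 0: the series stops after the rho**2 term.
    print(series_coefficients(6, 0, 5))
    # A kappa that violates Eq. 8.14: no coefficient ever vanishes.
    print(series_coefficients(Fraction(5), 0, 5))
    print(rescaled_laguerre(3, 0))
```

For a $\kappa$ that is not of the form required by Eq. 8.14, the ratio of successive coefficients approaches $2/j$, the same as for $\exp(2\rho)$, which overwhelms the decaying exponential in Eq. 8.13 and destroys normalizability.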

Anyway, I can now drop the double-index notation in $\kappa$ and the dimensionless energy
$\epsilon$ and classify the latter with a single index, the principal quantum number $n$. Taking
into account Eq. 8.14 and introducing

$$n = j_{max} + l + 1, \tag{8.15}$$





I find for the allowed energy values

$$E_n = E\epsilon_n = -\frac{Z^2}{n^2}E = -\frac{Z^2e^4\mu}{32\pi^2\varepsilon_r^2\varepsilon_0^2\hbar^2}\frac{1}{n^2} \equiv \frac{E_g}{n^2}, \tag{8.16}$$

where I introduced a separate notation $E_g$ for the ground state energy. It is also
sometimes useful to have this expression written in terms of the Bohr radius $a_B$
defined by Eq. 8.8:

$$E_n = -\frac{Z^2e^2}{8\pi\varepsilon_r\varepsilon_0a_B}\frac{1}{n^2}. \tag{8.17}$$

For a pure hydrogen atom in vacuum $Z = 1$ and $\varepsilon_r = 1$, and taking into account that
the ratio of the mass of a proton (the nucleus of a hydrogen atom is a single proton)
to the mass of the electron is approximately $m_p/m_e \approx 1.8\times10^3$, I can replace
the reduced mass $\mu$ with the electron mass. In this case, the numerical coefficient
in front of $1/n^2$ contains only universal constants and can be computed once and
for all. It defines the so-called Rydberg unit of energy, 1 Ry, which
is approximately equal to 13.6 eV. This is one of those numbers which is actually
worth remembering, just like the first few digits of the number $\pi$. The physical meaning of
this number has several interpretations. First of all, its negative is the ground state energy of the
hydrogen atom, but taking into account that the transition from the discrete energy levels
to the continuous spectrum (ionization of the atom) amounts to raising the energy
above zero, you can also interpret this value as the binding energy or ionization
energy of a hydrogen atom: the work required to change the electron’s energy from
the ground state to zero. Also, this number fixes the scale of atomic energies in
general. A transition to atoms heavier than hydrogen, which are characterized by
larger atomic numbers, makes the ground state energy more negative, increasing
the binding energy of the atom, which of course totally makes sense (nuclei with
a larger charge attract electrons more strongly).
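
If you would like to see where 13.6 eV comes from, Eq. 8.16 can be evaluated directly. The following is a minimal sketch, with hand-typed standard SI values of the constants (an assumption on my part; consult a current table of constants for authoritative digits) and with function names of my own invention. It also shows how little the answer changes when the reduced mass is used instead of the electron mass.

```python
import math

HBAR = 1.054571817e-34    # J s
E_CHARGE = 1.602176634e-19  # C
EPS0 = 8.8541878128e-12   # F/m
M_E = 9.1093837015e-31    # kg, electron
M_P = 1.67262192369e-27   # kg, proton


def reduced_mass(m1, m2):
    """Reduced mass of two particles."""
    return m1 * m2 / (m1 + m2)


def energy_level_ev(n, z=1, eps_r=1.0, mass=M_E):
    """Energy E_n of Eq. 8.16 converted from joules to electron volts."""
    scale = E_CHARGE**4 * mass / (32 * math.pi**2 * EPS0**2 * eps_r**2 * HBAR**2)
    return -scale * z**2 / n**2 / E_CHARGE


if __name__ == "__main__":
    print(energy_level_ev(1))                               # electron mass
    print(energy_level_ev(1, mass=reduced_mass(M_P, M_E)))  # reduced mass
    print([energy_level_ev(n) for n in (1, 2, 3)])
```

The reduced-mass correction shifts the ground state energy by only about five parts in ten thousand, which is why replacing $\mu$ with the electron mass is harmless at this level of accuracy.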

If you apply Eq. 8.16 to excitons in semiconductors, it will yield a very different
energy scale. This happens for several reasons. First, the masses of the interacting positive
and negative charges forming an exciton are comparable in magnitude, so one does
need to compute the reduced mass. Second, these masses are often an order of
magnitude smaller than the mass of the free electron, which results in a significant
decrease of the binding energy. This decrease is further enhanced by the relatively
large dielectric constant $\varepsilon_r$ of semiconductors. All these factors taken together
result in a much larger ground state energy of excitons (remember that the energy is
negative!) with a much smaller ionization or binding energy, which varies across
different semiconductors and can take values of the order of $10^{-3}$ to $10^{-2}$ eV.
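
To get a feeling for these numbers, one can simply rescale the hydrogen values: according to Eqs. 8.16 and 8.8, the binding energy scales as $\mu/\varepsilon_r^2$ and the Bohr radius as $\varepsilon_r/\mu$. The sketch below does just that. The effective masses and the dielectric constant in the example are purely illustrative round numbers of the kind quoted for common semiconductors; they are my assumptions and are not taken from the text, and the function name is likewise hypothetical.

```python
RYDBERG_EV = 13.6057         # hydrogen binding energy, eV
BOHR_RADIUS_M = 5.29177e-11  # hydrogen Bohr radius, m


def exciton_scales(m_electron, m_hole, eps_r):
    """Binding energy (eV) and Bohr radius (m) of a hydrogen-like exciton.

    Masses are given in units of the free electron mass.
    """
    mu = m_electron * m_hole / (m_electron + m_hole)  # reduced mass
    binding_energy = RYDBERG_EV * mu / eps_r**2       # from Eq. 8.16
    radius = BOHR_RADIUS_M * eps_r / mu               # from Eq. 8.8
    return binding_energy, radius


if __name__ == "__main__":
    # Illustrative (assumed) parameters, not data from the text.
    energy, radius = exciton_scales(m_electron=0.067, m_hole=0.45, eps_r=12.9)
    print(f"binding energy ~ {energy:.2e} eV, radius ~ {radius:.2e} m")
```

With these assumed parameters the binding energy lands in the range of a few millielectronvolts, and the radius comes out hundreds of times larger than that of a hydrogen atom.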

Finally, I would like to point out the fact that the discrete energy levels of
hydrogen-like atoms occupy the finite spectral region between the ground state and
zero. Despite this fact, the number of these levels is infinite, unlike, for instance, in
the case of a one-dimensional square potential well. This means that with increasing
principal quantum number $n$, the separation between adjacent levels becomes
smaller and smaller, and at some point the discreteness of the energy becomes
unrecognizable, even though the probability to observe the electron infinitely far
from the nucleus is still zero. One can think about this phenomenon as approaching a
classical limit, in which the electron’s motion is finite and it is still bound to the nucleus,
but quantum effects become negligibly small.

For each value of $n$, there are several combinations of $j_{max}$ and $l$ satisfying
Eq. 8.15, which means that all energy levels, except for the ground state, are
degenerate, with several wave functions belonging to the same energy eigenvalue
and differing from each other by the values of $l$ and $m$. The total degree of degeneracy is
easy to compute taking into account that for each $n$, there are $n$ values of $l$
($l = 0, 1, \ldots, n - 1$, since $l$ obeys the obvious inequality $l < n$), and for each $l$, there are
$2l + 1$ possible values of $m$. The total number of wave functions corresponding to the
same value of energy is, therefore, given by

$$\sum_{l=0}^{n-1}(2l + 1) = n^2. \tag{8.18}$$

As expected, this formula yields a single state for $n = 1$, for which $j_{max} = 0$ and
$l = m = 0$. For the energy level with $n = 2$, Eq. 8.18 predicts the existence of four
states, which we easily recognize as one state with $l = m = 0$ (in which case $j_{max} = 1$)
and three more characterized by $l = 1, m = -1$; $l = 1, m = 0$; and $l = 1, m = 1$.
For all three of them, the maximum power of the polynomial function in the solution is
$j_{max} = 0$. For some murky and not very important historical reasons, $l = 0$ states
are called s-states, $l = 1$ states are called p-states, $l = 2$ states are d-states, and, finally, the letter f
is reserved for $l = 3$ states. This nomenclature originates from the
words (sharp, principal, diffuse, and fundamental) used by early spectroscopists to describe the
series of spectral lines associated with each of these states, but
I am not going into this issue any further. Those who are interested are welcome to
google it.
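
The counting behind Eq. 8.18 can also be done by brute force. The sketch below (function names and the string of letters are my own illustrative choices) lists every allowed combination of $l$ and $m$ for a given $n$, together with the corresponding $j_{max}$ from Eq. 8.15, and attaches the spectroscopic letter to each level.

```python
SPECTROSCOPIC_LETTERS = "spdf"  # l = 0, 1, 2, 3


def degenerate_states(n):
    """All (l, m, j_max) combinations sharing the principal quantum number n."""
    return [(l, m, n - l - 1) for l in range(n) for m in range(-l, l + 1)]


def label(n, l):
    """Spectroscopic label such as '2p'; defined here only for l <= 3."""
    return f"{n}{SPECTROSCOPIC_LETTERS[l]}"


if __name__ == "__main__":
    for n in (1, 2, 3):
        states = degenerate_states(n)
        print(n, len(states), sorted({label(n, l) for l, _, _ in states}))
```

The number of listed states is $n^2$ for every $n$, in agreement with Eq. 8.18 (spin, which would double the count, is not considered here).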

Replacing all dimensionless variables with the physical radial coordinate $r = \rho na_B/Z$,
I find that the radial wave function $R_{n,l} = u_{n,l}/r$ with fixed values of $n$ and
$l$ is a product of $(Zr/na_B)^l$, the exponential function $\exp(-Zr/na_B)$, and a polynomial
$v_{n,l}(Zr/na_B)$ of the order $n - l - 1$. The polynomials which emerge in this problem
are well known in mathematical physics as associated Laguerre polynomials, defined
by two indexes as $L^p_{q-p}(x)$. The definition of these polynomials can be found in many
textbooks as well as online, but for your convenience, I will provide it here as well.
The definition is somewhat cumbersome and involves an additional polynomial
called simply a Laguerre polynomial (no “associated” here):

$$L_q(x) = \frac{e^x}{q!}\frac{d^q}{dx^q}\left(e^{-x}x^q\right). \tag{8.19}$$

To define the associated Laguerre polynomial, one needs to carry out some
additional differentiation of the simple Laguerre polynomial:

$$L^p_{q-p}(x) = (-1)^p\frac{d^p}{dx^p}L_q(x). \tag{8.20}$$

It is quite obvious that index $q$ in Eq. 8.19 specifies the degree of the respective
polynomial (the exponential functions obviously cancel each other after the differentiation
is performed). At the same time, index $q - p$ in Eq. 8.20 specifies the ultimate
degree of the associated polynomial (differentiating a polynomial of degree
$q$ a total of $p$ times reduces its degree by exactly this amount). So, in terms of these functions, the
polynomial appearing in the hydrogen model can be written as $v_{n,l}(Zr/na_B) =
L^{2l+1}_{n-l-1}(2Zr/na_B)$. The total normalized radial wave function $R_{n,l}(r)$ can be shown
to be


$$R_{n,l}(r) = \sqrt{\left(\frac{2Z}{na_B}\right)^3\frac{(n-l-1)!}{2n\,(n+l)!}}\,\exp\left(-\frac{Zr}{na_B}\right)\left(\frac{2Zr}{na_B}\right)^lL^{2l+1}_{n-l-1}\left(\frac{2Zr}{na_B}\right). \tag{8.21}$$


I surely hope you are impressed by the complexity of this expression and can
appreciate the amount of labor that went into finding the normalization coefficient
here, which you are given as a gift. I also have to warn you that different authors may
use different definitions of the Laguerre polynomials, which affects the appearance
of Eq. 8.21. More specifically, one might include an extra factor $1/(n+l)!$ either
in the definition of the polynomial or in the normalization factor. Equation 8.21
is written according to the former convention, while if the latter is accepted, the
term $(n+l)!$ must be replaced with $[(n+l)!]^3$. You might find both versions of
the hydrogen wave function on the Internet or in the literature, and my choice was
completely determined by the convention adopted by the popular computational
platform Mathematica©, which I use a lot to perform the computations
needed for this book. The total hydrogen wave function, which in the abstract
notation can be presented as $|n,l,m\rangle$, is obtained by multiplying the radial function
and the spherical harmonics $Y_l^m(\theta,\varphi)$:

$$\psi_{n,l,m}(r,\theta,\varphi) = R_{n,l}(r)\,Y_l^m(\theta,\varphi). \tag{8.22}$$
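
Because the normalization coefficient depends on the convention chosen for the Laguerre polynomials, it is worth checking numerically that a given combination of the two is consistent. The sketch below is my own illustration, not the way the formulas were obtained for the book. It assumes the convention in which $L_k^{\alpha}(0) = \binom{k+\alpha}{k}$ (the one that goes with a single factor $(n+l)!$ under the square root), implements the radial function with $a_B = 1$ by default, and integrates products of radial functions with a composite Simpson rule. All function names and integration parameters are illustrative assumptions.

```python
import math


def assoc_laguerre(k, alpha, x):
    """Associated Laguerre polynomial L_k^alpha(x) with L_k^alpha(0) = C(k+alpha, k)."""
    return sum(
        (-1) ** i * math.comb(k + alpha, k - i) * x**i / math.factorial(i)
        for i in range(k + 1)
    )


def radial_wave_function(n, l, r, z=1, a_b=1.0):
    """Normalized hydrogen-like radial function R_{n,l}(r)."""
    x = 2 * z * r / (n * a_b)
    norm = math.sqrt(
        (2 * z / (n * a_b)) ** 3
        * math.factorial(n - l - 1)
        / (2 * n * math.factorial(n + l))
    )
    return norm * math.exp(-x / 2) * x**l * assoc_laguerre(n - l - 1, 2 * l + 1, x)


def simpson(f, a, b, steps=8000):
    """Composite Simpson rule on [a, b] with an even number of steps."""
    steps += steps % 2
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def radial_overlap(n1, n2, l, r_max=80.0):
    """Integral of R_{n1,l} R_{n2,l} r**2 dr from 0 to r_max (in units of a_B)."""
    return simpson(
        lambda r: radial_wave_function(n1, l, r) * radial_wave_function(n2, l, r) * r**2,
        0.0,
        r_max,
    )


if __name__ == "__main__":
    print(radial_overlap(1, 1, 0), radial_overlap(3, 3, 1))  # normalization
    print(radial_overlap(1, 2, 0))                           # orthogonality
```

If you switch to the other convention for the polynomials, the same check immediately tells you whether the factorial in the normalization factor has been adjusted accordingly.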


Different factors in Eqs. 8.22 and 8.21 are responsible for different physical
effects; however, before giving them any useful interpretation, I have to remind you
that the respective probability distribution density is given by

$$P(r,\theta) = |\psi_{n,l,m}(r,\theta,\varphi)|^2r^2\sin\theta = \left(\frac{2Z}{na_B}\right)^3\frac{(n-l-1)!}{2n\,[(n+l)!]}\exp\left(-\frac{2Zr}{na_B}\right)\left(\frac{2Zr}{na_B}\right)^{2l}\left[L^{2l+1}_{n-l-1}\left(\frac{2Zr}{na_B}\right)\right]^2r^2\left[P_l^m(\cos\theta)\right]^2\sin\theta, \tag{8.23}$$

where I replaced the spherical harmonics by the product of associated Legendre
functions $P_l^m(\cos\theta)$ and $\exp(im\varphi)$ and took into account that the latter disappears
after multiplication by the respective complex-conjugate expression. The additional
factors $r^2$ and $\sin\theta$ are due to the spherical volume element, which has the form
$dV = r^2\sin\theta\,d\theta\,d\varphi\,dr$.

The exponential factor describes how fast the wave function decreases at infinity.
The respective characteristic scale

$$r_{at} = \frac{na_B}{Z} \tag{8.24}$$

can be interpreted (quite loosely though) as the size of the atom, because the
probability of finding the electron at a distance $r \gg r_{at}$ becomes exponentially small.
In the case of the atom in the ground state ($n = 1$), it is easy to show (i.e.,
if you remember that a maximum of a function is given by a zero of its first
derivative) that the distance $r = r_{at}$ corresponds to the maximum of the probability
$P(r) = \int P(r,\theta)\,d\theta$. In the case of a hydrogen atom ($Z = 1$), $r_{at} = a_B$, and if this
atom is in vacuum ($\varepsilon_r = 1$), the Bohr radius is determined by fundamental constants
only. Replacing the reduced mass with the mass of the electron, you can find that
the size of the hydrogen atom in the ground state is $a_B \approx 0.5\times10^{-10}$ m. This
number sets the atomic spatial scale just as 13.6 eV sets the typical energy
scale. In the case of excitons in semiconductors, the characteristic scale becomes
much larger for the same reasons that the energy scale becomes smaller: a large
dielectric constant and smaller masses yield a larger $a_B$; see Eq. 8.8. As a result, the
typical size of the exciton can be as large as $10^{-8}$ m, which is extremely important
for semiconductor physics, as it allows significant simplification of the quantum
description of excitons.

The radial distribution for higher-lying states can have several maximums, so such
a direct interpretation of $r_{at}$ becomes impossible, but it still can be thought of as
a cutoff distance, starting from which the probability for the electron to wander
off dramatically decreases. It is interesting that this parameter increases with $n$, so
excited atoms not only have more energy, but they are also larger in size. Figure 8.2
presents a number of radial functions for your perusal, which illustrate most of the
properties discussed here.

The factors containing the power of the radial coordinate are responsible for
electrons not falling on the nucleus: the probability that $r = 0$ is strictly zero.
This probability for $r < r_{at}$ decreases with increasing angular momentum number
$l$, which can be interpreted as a manifestation of the “centrifugal” force keeping
rotating particles away from the center of rotation. The Laguerre polynomial
factor is essentially responsible for the behavior of the radial wave function at the
intermediate distances between zero and $r_{at}$: the degree of the respective polynomial
determines how many zeroes the radial wave function has. Finally, the Legendre
function $P_l^m(\cos\theta)$ is responsible for the directionality of the probability distribution
with respect to the Z-axis. States with zero angular momentum are described by a
completely isotropic wave function, which does not depend on direction at all. For
states with non-zero $l$, an important parameter is $l - |m|$, which yields the number of
zeroes of the Legendre function and can also be used to determine the number of the
respective maximums. The properties of the Legendre functions have already been
discussed in Sect. 5.1.4, and the plots illustrating them were presented in Fig. 5.3,
which you might want to consult to refresh your memory.

Fig. 8.2 A few radial functions with different values of the principal and orbital numbers $n$ and $l$.
All graphs in the left panel correspond to $l = 0$ and $n$ increasing from 1 to 3. The number
of zeroes of the functions is equal to $n - l - 1$. The graphs in the right panel correspond to
$n = 3, l = 1$; $n = 3, l = 2$; and $n = 4, l = 1$ (which one is which you can figure out yourselves
by counting zeroes). The functions are not normalized for convenience of display


8.3 Virial and Feynman-Hellmann Theorems and 
Expectation Values of the Radial Coordinate 
in a Hydrogen Atom 


I will finish the chapter by discussing one apparently very special and technical but
at the same time practically very important problem of calculating the expectation
values $\langle r^p\rangle$ of various powers of the radial coordinate $r^p$, where $p$ can be any negative
or positive integer, in the stationary states of a hydrogen atom. Formally, the calculation
of these expectation values involves the evaluation of the integrals

$$\langle r^p\rangle = \int_0^\infty r^{p+2}\left[R_{nl}(r)\right]^2dr, \tag{8.25}$$

where $R_{nl}(r)$ has been defined in Eq. 8.21 and the extra 2 in $r^{p+2}$ comes from the
term $r^2$ in the probability distribution generated by the hydrogen-like wave function,
Eq. 8.23. Direct calculation of the integral in Eq. 8.25 is a hopeless task given the
complexity of the radial function, but it is possible to circumvent the problem by
relying on the radial equation, Eq. 8.7, itself, rather than on the explicit form of its
solution, Eq. 8.21.
But first, let me derive a remarkable relation between the expectation values
of the kinetic and potential energies of a quantum particle, known as the virial theorem.
Consider the expectation value of the operator $\mathbf{r} \cdot \mathbf{p}$ in an arbitrary quantum state and
compute its time derivative using the Heisenberg picture of quantum mechanics
(the expectation values do not really depend on which picture is used, but working
with time-dependent Heisenberg operators and time-independent states is more
convenient than using the Ehrenfest theorem, Eq. 4.17, for Schrödinger operators):

$$\frac{d}{dt} \langle \mathbf{r} \cdot \mathbf{p} \rangle = \left\langle \frac{d\mathbf{r}}{dt} \cdot \mathbf{p} \right\rangle + \left\langle \mathbf{r} \cdot \frac{d\mathbf{p}}{dt} \right\rangle$$

Applying Heisenberg equations for the position and momentum operators, Eqs. 4.28
and 4.29, to this expression, I obtain

$$\frac{d}{dt} \langle \mathbf{r} \cdot \mathbf{p} \rangle = 2 \langle K \rangle - \langle \mathbf{r} \cdot \nabla V \rangle \tag{8.26}$$

The left-hand side of Eq. 8.26 must vanish if the state used to compute the 
expectation value is an eigenvector of the Hamiltonian (a stationary state in the 
Schrodinger picture) because the expectation value of any operator in a stationary 
state is time-independent. This allows me to conclude that in the stationary states, 
the expectation values of kinetic and potential energies satisfy the relation 


$$2 \langle K \rangle = \langle \mathbf{r} \cdot \nabla V \rangle \tag{8.27}$$

known as the virial theorem. In the case of the Coulomb potential of the hydrogen atom
Hamiltonian, this theorem yields 


$$2 \langle K \rangle = \frac{Z e^2}{4 \pi \varepsilon_r \varepsilon_0} \left\langle \frac{1}{r} \right\rangle \tag{8.28}$$


Since the expectation value of the Hamiltonian in its own stationary state is simply 
equal to the respective eigenvalue, I can write for the hydrogen-like Hamiltonian: 



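$$\langle K \rangle - \frac{Z e^2}{4 \pi \varepsilon_r \varepsilon_0} \left\langle \frac{1}{r} \right\rangle = E_n = -\frac{Z e^2}{8 \pi \varepsilon_r \varepsilon_0} \left\langle \frac{1}{r} \right\rangle$$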

where I replaced the expectation value of the Hamiltonian with its eigenvalue for 
the n-th stationary state and used Eq. 8.28 to eliminate the expectation value of the 
kinetic energy. Finally, using Eq. 8.16 for $E_n$, I have

$$-\frac{Z^2 e^4 \mu}{32 \pi^2 \varepsilon_r^2 \varepsilon_0^2 \hbar^2} \frac{1}{n^2} = -\frac{Z e^2}{8 \pi \varepsilon_r \varepsilon_0} \left\langle \frac{1}{r} \right\rangle$$

$$\left\langle \frac{1}{r} \right\rangle = \frac{Z e^2 \mu}{4 \pi \varepsilon_r \varepsilon_0 \hbar^2} \frac{1}{n^2} = \frac{Z}{a_B n^2} \tag{8.29}$$

where in the last step I used Eq. 8.8 for the Bohr radius $a_B$. The expectation values $\langle r^p \rangle$
for almost all other values of $p$ can be derived using the so-called Kramers’ recursion
relations, which I provide here without proof:


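$$\frac{p+1}{n^2} \langle r^p \rangle - (2p+1) \frac{a_B}{Z} \langle r^{p-1} \rangle + \frac{p}{4} \left[ (2l+1)^2 - p^2 \right] \frac{a_B^2}{Z^2} \langle r^{p-2} \rangle = 0. \tag{8.30}$$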
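Since Eq. 8.30 is a recursion, it is easy to let a computer climb it. The following is only a minimal sketch under stated assumptions: the function name `kramers_moments` is hypothetical, lengths are measured in units of $a_B/Z$, and the recursion is seeded with the normalization $\langle r^0 \rangle = 1$ and with Eq. 8.29. Exact fractions are used so that no rounding enters.

```python
from fractions import Fraction


def kramers_moments(n, l, p_max):
    """Return {p: <r^p>} for p = -1 .. p_max, in units of (a_B / Z)^p.

    Hypothetical helper (not from the text): it starts from <r^0> = 1 and
    Eq. 8.29, then uses Eq. 8.30 solved for <r^p> to climb upward in p.
    """
    if n < 1 or not 0 <= l < n:
        raise ValueError("quantum numbers must satisfy n >= 1 and 0 <= l < n")
    moments = {-1: Fraction(1, n * n), 0: Fraction(1)}
    for p in range(1, p_max + 1):
        # Eq. 8.30 with a_B / Z = 1, rearranged so that <r^p> stands alone
        bracket = (2 * p + 1) * moments[p - 1] - Fraction(p, 4) * (
            (2 * l + 1) ** 2 - p * p
        ) * moments[p - 2]
        moments[p] = Fraction(n * n, p + 1) * bracket
    return moments


if __name__ == "__main__":
    for n, l in [(1, 0), (2, 0), (2, 1)]:
        m = kramers_moments(n, l, 2)
        print(f"n={n}, l={l}: <r> = {m[1]}, <r^2> = {m[2]}")
```

For the ground state this reproduces the familiar $\langle r \rangle = 3a_B/2Z$, which is a reasonable sanity check of the reconstructed signs in the recursion.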


It is easy to see that I can indeed use Eqs. 8.29 and 8.30 to find $\langle r^p \rangle$ for any positive
$p$, but Kramers’ relations fail to yield $\langle r^{-2} \rangle$: this term could arise if you set
$p = 0$, but, unfortunately, the corresponding term vanishes because of the factor
$p$ in it. Therefore, I have to find an independent way of computing $\langle r^{-2} \rangle$. Luckily,
there exists a cool theorem, which Richard Feynman derived while working on his
undergraduate thesis, called the Feynman-Hellmann theorem. 1 The derivation of this
theorem is based on the obvious identity, which is valid for an arbitrary Hamiltonian
and which I have already mentioned when deriving Eq. 8.29. To reiterate, the
identity states that

$$E_n = \langle \psi_n | H | \psi_n \rangle$$

if $|\psi_n\rangle$ are the eigenvectors of $H$. Now assume that the Hamiltonian $H$ depends on
some parameter $\lambda$. It can be, for instance, the mass of a particle, or its charge, or
something else. It is obvious then that the eigenvalues and the eigenvectors also
depend on the same parameter. Differentiating this identity with respect to this
parameter, you get


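$$\frac{\partial E_n}{\partial \lambda} = \left\langle \frac{\partial \psi_n}{\partial \lambda} \right| H \left| \psi_n \right\rangle + \left\langle \psi_n \right| H \left| \frac{\partial \psi_n}{\partial \lambda} \right\rangle + \left\langle \psi_n \right| \frac{\partial H}{\partial \lambda} \left| \psi_n \right\rangle$$

The first two terms in this expression can be transformed as

$$\left\langle \frac{\partial \psi_n}{\partial \lambda} \right| H \left| \psi_n \right\rangle + \left\langle \psi_n \right| H \left| \frac{\partial \psi_n}{\partial \lambda} \right\rangle = E_n \left( \left\langle \frac{\partial \psi_n}{\partial \lambda} \Big| \psi_n \right\rangle + \left\langle \psi_n \Big| \frac{\partial \psi_n}{\partial \lambda} \right\rangle \right) = E_n \frac{\partial \langle \psi_n | \psi_n \rangle}{\partial \lambda} = 0$$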


1 Hellmann derived this theorem 4 years before Feynman but published it in an obscure Russian
journal, so it remained unknown until Feynman rediscovered it.
where I used the fact that all eigenvectors are normalized to unity, so that their norm,
appearing in the last line of the above derivation, is just a constant. Thus, here is the
statement of the Feynman-Hellmann theorem:

$$\frac{\partial E_n}{\partial \lambda} = \left\langle \psi_n \right| \frac{\partial H}{\partial \lambda} \left| \psi_n \right\rangle \tag{8.31}$$

This is a very simple, almost trivial result, and it is quite amazing that it can be used 
to solve rather complicated problems, such as finding the expectation value $\langle r^{-2} \rangle$
in the hydrogen atom problem. So, let’s see how this is achieved. Going back to
Eq. 8.7, you can recognize that this equation can be seen as an eigenvalue equation
for the Hamiltonian

$$H_r = -\frac{\hbar^2}{2 \mu r^2} \frac{d}{dr} \left( r^2 \frac{d}{dr} \right) + \frac{\hbar^2 l(l+1)}{2 \mu r^2} - \frac{Z e^2}{4 \pi \varepsilon_r \varepsilon_0 r} \tag{8.32}$$

and that hydrogen energies are eigenvalues of this Hamiltonian. Therefore, I can
apply the Feynman-Hellmann theorem to this Hamiltonian, choosing, for instance,
the orbital quantum number $l$ as the parameter $\lambda$. Differentiation of Eq. 8.32 with
respect to $l$ yields

$$\frac{\partial H_r}{\partial l} = \frac{\hbar^2 (2l+1)}{2 \mu r^2}$$

In order to find the derivative $\partial E_n / \partial l$, one needs to recall that the principal quantum
number is related to the orbital number as $n = l + n_r + 1$ so that

$$\frac{\partial E_n}{\partial l} = \frac{\partial E_n}{\partial n} = \frac{Z^2 e^2}{4 \pi \varepsilon_r \varepsilon_0 a_B} \frac{1}{n^3}$$

Now, applying the Feynman-Hellmann theorem, I can write

$$\frac{\hbar^2 (2l+1)}{2 \mu} \left\langle \frac{1}{r^2} \right\rangle = \frac{Z^2 e^2}{4 \pi \varepsilon_r \varepsilon_0 a_B} \frac{1}{n^3}$$

where I used Eq. 8.17 for the energy. Rearranging this result and applying Eq. 8.8
for the Bohr radius, I obtain the final expression for $\langle r^{-2} \rangle$:

$$\left\langle \frac{1}{r^2} \right\rangle = \frac{Z^2 e^2 \mu}{2 \pi \hbar^2 \varepsilon_r \varepsilon_0 a_B} \frac{1}{(2l+1) n^3} = \frac{2 Z^2}{a_B^2} \frac{1}{(2l+1) n^3} \tag{8.33}$$

Now, boys and girls, if what you have just witnessed is not a piece of pure magic 
with the Feynman-Hellmann theorem working as a magic wand, I do not know what 
else you would call it. And if you are not able to appreciate the awesomeness of this 
derivation, you probably shouldn’t be studying quantum mechanics or physics at 
all for that matter. This result is also the key to finding, with the help of Kramers’
relations, Eq. 8.30, the expectation values $\langle r^p \rangle$ for any $p$. For instance, to find
$\langle r^{-3} \rangle$, you just need to use Eq. 8.30 with $p = -1$:


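$$\frac{a_B}{Z} \langle r^{-2} \rangle - \frac{a_B^2}{4 Z^2} \left[ (2l+1)^2 - 1 \right] \langle r^{-3} \rangle = 0 \;\Rightarrow\; \langle r^{-3} \rangle = \frac{4Z}{a_B} \frac{\langle r^{-2} \rangle}{(2l+1)^2 - 1} = \frac{2 Z^3}{a_B^3 \, l(l+1)(2l+1) n^3}. \tag{8.34}$$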
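If you would rather not take Eqs. 8.29, 8.33, and 8.34 on faith, a brute-force numerical integration of Eq. 8.25 makes a useful cross-check. The sketch below is illustrative only and rests on assumptions that are not part of the derivation above: it takes $Z = 1$ and $a_B = 1$, uses the standard normalized radial function of the $n = 2$, $l = 1$ state, $R_{21}(r) = r e^{-r/2} / (2\sqrt{6})$, and the helper names `simpson` and `moment_2p` are hypothetical.

```python
import math


def simpson(f, a, b, steps=20000):
    """Composite Simpson rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def moment_2p(p, r_max=60.0):
    """<r^p> for n = 2, l = 1 with Z = 1 and a_B = 1, from Eq. 8.25.

    Assumes R_21(r) = r * exp(-r / 2) / (2 * sqrt(6)), so that
    r^(p+2) * R_21(r)^2 = r^(p+4) * exp(-r) / 24. The upper limit is
    cut off at r_max, where the integrand is negligibly small.
    """
    return simpson(lambda r: r ** (p + 4) * math.exp(-r) / 24.0, 0.0, r_max)


n, l = 2, 1
closed_forms = {
    -1: 1 / n**2,                                # Eq. 8.29
    -2: 2 / ((2 * l + 1) * n**3),                # Eq. 8.33
    -3: 2 / (l * (l + 1) * (2 * l + 1) * n**3),  # Eq. 8.34
}
for p, exact in closed_forms.items():
    print(f"p = {p}: numerical {moment_2p(p):.10f}, closed form {exact:.10f}")
```

The same check can be repeated for other states by swapping in their radial functions, as long as the integrand is written so that it stays finite at $r = 0$.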


If the sheer wonder at our ability to compute $\langle r^p \rangle$ without using the unseemly
Laguerre polynomials is not a sufficient justification for you to vindicate spending 
some time doing these calculations, you will have to wait till Chap. 14, where I 
will put this result to actual use in understanding the fine structure of the spectra of 
hydrogen-like atoms. 


8.4 Problems 


Problem 103 Using Eqs. 8.4 and 8.5 together with the canonical commutation
relations for single-particle coordinates and momenta, derive the commutator
between the relative position vector $\mathbf{r}$ and the corresponding momentum $\mathbf{p}_r$ to convince
yourself that these variables, indeed, obey the canonical commutation relations.

Problem 104 Verify that Eq. 8.10 defines a quantity of the dimension of energy. 

Problem 105 

1. Derive Eq. 8.14 by applying the power series method to Eq. 8.12 and carrying out 
the procedure outlined in the text. 

2. Find all radial functions with n = 1 and n = 2. Normalize them. 

Problem 106 Using the definition of the associated Laguerre functions provided in
the text, find explicit expressions for the radial functions corresponding to the states 
considered in the previous problem. Normalize them and make sure that the results 
are identical to those obtained previously. 

Problem 107 An operator of the dipole moment is defined as $\mathbf{d} = e\mathbf{r}$, where $e$ is
the elementary charge and $\mathbf{r}$ is the position operator of the electron in the hydrogen
atom. A dipole moment of a transition is defined as a matrix element of this operator
between initial and final states of a system: $\mathbf{d}_{nlm,n'l'm'} = \langle nlm | \mathbf{d} | n'l'm' \rangle$. Evaluate
this dipole moment for the transitions between the ground state of the atom and all
degenerate states characterized by $n = 2$.

Problem 108 Find the expectation values $\langle r \rangle$, $\langle 1/r \rangle$, and $\langle r^2 \rangle$ for a hydrogen atom
in the $|2, 1, m\rangle$ state.
Problem 109 Using the results of the previous problem and full 3-D Schrödinger
equation with non-separated variables, find Find a relation between the
expectation values of the potential and kinetic energies.

Problem 110 A hydrogen atom is prepared in an initial state: 


$$\psi(r, \theta, \varphi) = \frac{1}{\sqrt{2}} \left( \psi_{2,1,1}(r, \theta, \varphi) + \psi_{1,0,0}(r, \theta, \varphi) \right).$$


Find the expectation value of the potential energy as a function of time. 

Problem 111 Consider a hydrogen atom in a state described by the following wave 
function: 


where 


4>( r ) = Ri,o(r) + a— 


— V2x 

r 


RlAr) 


R n ,i(r) = ( r/na B ) l+l exp 



T 2/+1 

^n-l -1 


( 2 r/na B ) . 


1. Rewrite this function in terms of normalized hydrogen wave functions. 

2. Find the values of coefficient a that would make the entire function normalized. 

3. If you measure $\hat{L}^2$ and $L_z$, what values can you get and with what probabilities?

4. If you measure energy, which values are possible and what are their probabilities?

5. Find the probability that the measurement of the particle’s position will find it in
the direction specified by the polar angle $\theta$ with $44^\circ < \theta < 46^\circ$.

6. Find the probability that the measurement of the particle’s position will find the
particle at a distance $0.5 a_B < r < a_B$ from the nucleus.



Chapter 9 

Spin 1/2 




9.1 Introduction: Why Spin? 

The model of a pure spin 1/2, detached from all other degrees of freedom of a 
particle, is one of the simplest in quantum mechanics. Yet, it defies our intuition and 
resists developing that pleasant sensation of being able to relate a new concept to 
something that we think we already know (or at least are used to thinking about). 
We call this feeling “intuitive understanding,” and it does play an important albeit 
mysterious role in our ability to use new concepts. The reason for this difficulty, of 
course, lies in the fact that spin is a purely quantum phenomenon with no reasonable 
way to model it on something that we know from classical physics. While the
only bulletproof remedy for this predicament known to me is practice, I will try
to somehow ease your pain by taking the time to develop the concept of spin and by
providing empirical and theoretical arguments for its inevitability.

Experimentally, spin manifests itself most directly via the interaction between electrons
and a magnetic field and can be defined as an inherent property of electrons
responsible for this interaction. This definition is akin to the definition of electric
charge as a property responsible for the electron’s interaction with the electric field or
of mass as a characteristic determining the electron’s acceleration under the action of a
force. The substantial difference, of course, is that charge and mass are immutable
scalar quantities, our views of which do not change when we transition from classical
to quantum theories of nature. The concept of spin, on the other hand, is purely
quantum and embodies two distinct types of entities. First is a Hermitian vector
operator, characterized by two distinct eigenvalues and corresponding eigenvectors,
which specify the possible experimental outcomes when one attempts to measure
spin. Second are the spinors, a particular type of vectors subjected to the action of
the spin operator and representing various spin states; they control the probability
of one or another outcome of the measurement.
To untangle the connections between spin, angular momentum, and magnetic
interactions, let me begin with a simple example of a classical electron moving
along a circular orbit of radius $R$ with period $T$. Taken literally, this example does
not make much sense, but it does produce surprisingly reasonable results, so it can
be considered a convenient and meaningful metaphor. So, imagine an observer
placed at some point on the orbit and counting the number of times the electron
passes by during some time $t \gg T$. The number of “sightings” of the electron, $n$,
is related to the duration of the experiment $t$ and the period $T$ as $t = nT$. The total
amount of charge that passes by the observer is obviously $q = ne = et/T$, where $e$
is the elementary charge. The amount of charge passing by per unit time is what
we call the electric current, which can be found as $I = q/t = et/(Tt) = e/T$. This
crude trick replaced a circulating electron by a stationary electric current, which,
of course, only makes sense if I spread the entire charge of the electron along its
orbit by some kind of averaging procedure. But as I said, I am treating this model
only as a metaphor. Accepting this metaphor, I can follow up by remembering that
the interaction between a steady loop of current and a uniform magnetic field is
described by the loop’s magnetic dipole moment $\boldsymbol{\mu}$, defined as $\boldsymbol{\mu} = I A \mathbf{n}$, where $A$
is the area of the loop and $\mathbf{n}$ is the unit vector normal to the plane of the loop with
direction determined by the right-hand rule (do you remember the right-hand rule?).
In the case of the orbiting electron, the loop area is $A = \pi R^2$, so I have


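$$\boldsymbol{\mu}_L = \frac{e}{T} \pi R^2 \mathbf{n} = \frac{e v}{2 \pi R} \pi R^2 \mathbf{n} = \frac{e \, m_e v R}{2 m_e} \mathbf{n} = -\frac{e}{2 m_e} \mathbf{L} \tag{9.1}$$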


where I (a) expressed the period $T$ in terms of the circumference $2\pi R$ and orbital
velocity $v$: $T = 2\pi R / v$, (b) multiplied the numerator and the denominator of
the resulting expression by the electron’s mass $m_e$, and (c) recognized that $m_e v R \mathbf{n}$
is a vector, which is equal in magnitude and opposite in direction to the orbital
angular momentum of the electron $\mathbf{L}$. To figure out the “opposite” part of the last statement,
recall that the magnetic moment is defined by the direction of the current, that is, the motion
of positive charges, while the charge of our orbiting electron is negative, and,
therefore, it rotates in the direction opposite to the current. Equation 9.1 establishes 
the connection between the magnetic dipole moment of the electron and its orbital 
angular momentum. 

The interaction between a classical magnetic dipole and a uniform magnetic field 
B can be described by a potential energy: 


$$U_B = -\boldsymbol{\mu}_L \cdot \mathbf{B} = \frac{e}{2 m_e} \mathbf{L} \cdot \mathbf{B}. \tag{9.2}$$

According to this expression, the potential energy has a minimum when the 
magnetic dipole is oriented along the magnetic field and a maximum when they 
are oriented antiparallel to each other. For both these orientations, the torque on the
dipole $\boldsymbol{\tau} = \boldsymbol{\mu}_L \times \mathbf{B}$ is zero, so these are two equilibrium positions, but while the
former is the stable equilibrium, the latter is unstable. Equation 9.2 also establishes
the connection between the potential energy $U_B$ and the electron’s angular momentum
$\mathbf{L}$, which is quite useful for transitioning to the quantum description. Quantization in
this case consists merely in promoting the components of the angular momentum to
the status of operators. This newly born operator $U_B$ can now be added to the
Hamiltonian $H_0$ describing the electron in the absence of the magnetic field to yield

$$H = H_0 + \frac{e}{2 m_e} \mathbf{B} \cdot \mathbf{L}. \tag{9.3}$$

$H_0$, for instance, can describe an electron moving in some central potential $V(r)$ (the
Coulomb potential would be a good example), and I will assume that its eigenvalues
$E_{n,l}$ and eigenvectors $|n, l, m\rangle$ are known. The choice of notation here reflects the
fact that the eigenvectors of the Hamiltonian with a central potential must also be
eigenvectors of the angular momentum operators $L^2$ and $L_z$ and that its eigenvalues do
not depend on the magnetic quantum number $m$.

It is quite easy to verify that if I choose the polar ($Z$) axis of the coordinate system
in the direction of the uniform magnetic field $\mathbf{B}$, the eigenvectors $|n, l, m\rangle$ of $H_0$ remain
eigenvectors of the total Hamiltonian given by Eq. 9.3. The corresponding
eigenvalues are found as

$$\left( H_0 + \frac{eB}{2 m_e} L_z \right) |n, l, m\rangle = E_{n,l} |n, l, m\rangle + \frac{eB}{2 m_e} \hbar m \, |n, l, m\rangle = \left( E_{n,l} + \frac{e \hbar B}{2 m_e} m \right) |n, l, m\rangle. \tag{9.4}$$

The combination of fundamental constants $e\hbar / 2m_e$ has the dimension of a magnetic
dipole moment and is prominent enough to warrant giving it its own name. The Bohr
magneton $\mu_B$ is defined as

$$\mu_B = \frac{e \hbar}{2 m_e} \tag{9.5}$$

so that the expression for the energy eigenvalues can be written down as

$$E_{n,l,m} = E_{n,l} + m \mu_B B. \tag{9.6}$$

The term $m \mu_B B$ can be interpreted as the energy of interaction between the uniform
magnetic field and a quantized magnetic moment with values that are multiples
of $\mu_B$. In this sense, the Bohr magneton can be thought of as a quantum of
magnetic dipole moment. The most remarkable prediction of this simple computation
is the $m$-dependence of the resulting energy levels, which is responsible for
lifting the original $(2l+1)$-fold degeneracy of the energy eigenvectors. Since the magnetic
field is the primary reason for this, it seems quite natural to give the quantum number $m$
the name of “magnetic” number.
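To get a feeling for the magnitudes involved, it helps to plug numbers into Eqs. 9.5 and 9.6. The snippet below is a rough sketch rather than anything taken from the derivation: the numerical constants are the published SI/CODATA values typed in by hand, and the function name `zeeman_levels` is hypothetical.

```python
E_CHARGE = 1.602176634e-19   # elementary charge, C
HBAR = 1.054571817e-34       # reduced Planck constant, J s
M_E = 9.1093837015e-31       # electron mass, kg

MU_B = E_CHARGE * HBAR / (2 * M_E)   # Bohr magneton, Eq. 9.5, in J/T


def zeeman_levels(l, B):
    """Field-induced shifts m * mu_B * B (in eV) for m = -l .. l, Eq. 9.6."""
    return {m: m * MU_B * B / E_CHARGE for m in range(-l, l + 1)}


print(f"mu_B = {MU_B:.6e} J/T")
for m, shift in zeeman_levels(1, 1.0).items():
    print(f"m = {m:+d}: shift = {shift:+.3e} eV")
```

Even in a rather strong field of 1 T, the shifts are of the order of $10^{-4}$ eV, tiny compared with the electron-volt scale of the unperturbed levels, which is why the split lines are so closely positioned.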
Experimentally, this degeneracy lifting is observed via the Zeeman effect, the
splitting of the absorption or emission spectral lines in the presence of a magnetic
field. I will discuss the relation between the absorption/emission of light and atomic
energy levels in more detail in Part III of the book, but at this point, it is sufficient
to recall Bohr’s old postulates, one of which relates the frequencies of the absorbed or
emitted light to atomic energy levels:

$$\omega_{\alpha\beta} = \frac{E_\alpha - E_\beta}{\hbar}$$

where $\alpha$, $\beta$ are composite indexes replacing groups of $n, l, m$ for the sake of
simplicity of notation. So, if you, say, observe light emission due to the transition
from the first excited state of the hydrogen atom with $n = 2$ to the ground state, in
the absence of a magnetic field, you would see just one emission line formed by
transitions from the states $|2, 0, 0\rangle$, $|2, 1, -1\rangle$, $|2, 1, 0\rangle$, and $|2, 1, 1\rangle$, all of which
have the same energy, $E_2 = -E/4$, where $E$ was defined in Eq. 8.10. When the
magnetic field is turned on, two of these states, $|2, 1, -1\rangle$ and $|2, 1, 1\rangle$, acquire
magnetic field-related corrections:

$$E_{2,-1} = -E/4 - \mu_B B$$
$$E_{2,1} = -E/4 + \mu_B B,$$

making their energies different from each other and from $E_{2,0}$. As a result, instead of a
single emission line with frequency $\omega = (E_2 - E_1)/\hbar = 3E/4\hbar$, an experimentalist
would observe three lines at frequencies:

$$\omega_{-1} = \frac{3E}{4\hbar} - \frac{\mu_B}{\hbar} B$$
$$\omega_0 = \frac{3E}{4\hbar}$$
$$\omega_1 = \frac{3E}{4\hbar} + \frac{\mu_B}{\hbar} B.$$
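The three frequencies above are easy to evaluate numerically. This is a hedged illustration, not part of the original argument: I assume that for hydrogen the energy scale $E$ of Eq. 8.10 is approximately 13.605693 eV (the value for $\varepsilon_r = 1$ with the reduced mass replaced by the electron mass), and the function name `lyman_alpha_triplet` is hypothetical.

```python
E_CHARGE = 1.602176634e-19   # elementary charge, C
HBAR = 1.054571817e-34       # reduced Planck constant, J s
M_E = 9.1093837015e-31       # electron mass, kg
E_SCALE_EV = 13.605693       # ASSUMED value of E from Eq. 8.10 for hydrogen, eV


def lyman_alpha_triplet(B):
    """Angular frequencies (rad/s) of the m = -1, 0, +1 lines of the n = 2 -> 1 transition."""
    omega_0 = 3 * E_SCALE_EV * E_CHARGE / (4 * HBAR)   # 3E / (4 hbar)
    shift = E_CHARGE * B / (2 * M_E)                   # mu_B * B / hbar
    return omega_0 - shift, omega_0, omega_0 + shift


low, middle, high = lyman_alpha_triplet(1.0)
print(f"omega_0 = {middle:.4e} rad/s, splitting = {high - middle:.4e} rad/s")
```

Note that $\hbar$ drops out of the splitting, $\mu_B B / \hbar = eB / 2m_e$, so the separation between neighboring lines depends only on the field and on the charge-to-mass ratio of the electron.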

You should not think though that by deriving Eq. 9.4, I completely solved the 
Zeeman effect. The actual problem is much more complicated and involves addition 
of orbital and spin magnetic moments, as well as multi-electron effects, relativistic 
corrections, magnetic moment of nuclei, etc. What I did was just an illustration 
designed to make a particular point: the lifting of the $2l + 1$
degeneracy of atomic levels by the magnetic field gives rise to an odd number of closely positioned
spectral lines. While for some atoms an odd number of lines is indeed observed, a
large number of other observations manifest splitting into an even number of lines. This
phenomenon, called the anomalous Zeeman effect, cannot be explained by interaction
with the orbital magnetic moment, because an even number of lines implies half-integer
values of $l$. To explain this effect, we have to admit that in addition to the “normal”
orbital angular momentum, electrons also have another magnetic moment, which
cannot be constructed from the coordinate and momentum operators and has to be,
therefore, an intrinsic property of the electron not related to other regular (spatial-temporal)
observables. The lowest number of observed split lines was equal to two.
Equating $2l + 1$ to 2, you find that this splitting corresponds to $l = 1/2$. If you also
recall that $l$ is the maximum value of the magnetic number $m$, you might realize that
$m$ in this case can have only two values, $m = \pm 1/2$.

A meticulous and mischievous reader might ask, of course, if it is absolutely
necessary to derive a magnetic dipole moment from an angular momentum. Can
a magnetic moment exist just by itself with no angular momentum attached to
it? The answer to the first question is yes and to the second one is, obviously, no.
To justify these answers, however, is not so easy, and the path toward realizing 
that electrons do possess an intrinsic angular momentum, which can be in one of 
two possible states, was a long one. Such physicists as Wolfgang Pauli (Austria- 
Switzerland-USA) and Arnold Sommerfeld (Germany) recognized very early that 
purely orbital state of electrons proposed in Bohr’s model of atoms could not 
explain all experimental data, which consistently indicated that the actual number 
of states is double what Bohr’s model predicted. Pauli was writing about the
“two-valuedness” of electrons in early 1925 as he needed it to explain the structure
of atoms and formulate his famous exclusion principle. Later in 1925 two
graduate students of Paul Ehrenfest from Leiden, the Netherlands, Goudsmit and
Uhlenbeck, published a paper in which they proposed that the required additional
states come from an intrinsic angular momentum of electrons due to their “spinning”
on their own axis. They postulated that this new angular momentum of electrons, $\mathbf{S}$,
is related to its magnetic moment $\boldsymbol{\mu}_s$ in a way similar to the relation between orbital
momentum and orbital magnetic moment, but in order to fit experimental data, they
had to multiply the Bohr magneton by 2:

$$\boldsymbol{\mu}_s = -2 \times \frac{e}{2 m_e} \mathbf{S} = -2 \frac{\mu_B}{\hbar} \mathbf{S}. \tag{9.7}$$

The idea appeared so ridiculous to many serious physicists (such as Lorentz) that 
the students almost withdrew their paper, but luckily for them (and for physics), 
it was too late, and the paper was published. Eventually, it was recognized that
it was indeed wrong to think that a point-like particle such as an electron can
actually spin about its axis (estimates of the required spinning speed would put it
well above the speed of light), so this classical mechanistic interpretation had to go.
The idea of the intrinsic angular momentum, which “just is” as one of the attributes
of electrons, survived nevertheless, committing the names of Goudsmit and Uhlenbeck to the
history of physics. Ironically, this was the highest achievement of their lives: they
both made decent careers in physics, moving to the USA and securing respectable
professorial positions, but they never did anything as significant as their almost
withdrawn student paper on spin.

There are other purely theoretical arguments for understanding spin as a different 
kind of the angular momentum, but this discussion is for a different time and a 
different book. At this point let me just mention that if we want to be able to 
add orbital angular momentum and spin angular momentum, which is absolutely
necessary to explain a host of effects in atomic spectra, we must require that they
both are described by objects of the same mathematical nature. This means that
if the orbital momentum is described in quantum mechanics by three operator
components $L_x$, $L_y$, and $L_z$ of the angular momentum vector with commutation
relations given by Eqs. 3.53-3.55, spin angular momentum must also be described
by three operator components $S_x$, $S_y$, and $S_z$ with the same commutation relations.
Our calculations in Sect. 3.3.4 demonstrated that these commutation relations ensure
that one of the operator components (usually it is chosen to be the $z$-component)
and the operator of the square of the angular momentum $L^2$ (or $S^2$) can have a
common system of eigenvectors characterized by a set of two eigenvalues, $\hbar m_l$ for
the $z$-component and $\hbar^2 l(l+1)$ for the square operator, where $m_l$ can take values
$m_l = -l, -l+1, \ldots, l-1, l$ and $l$ can be either integer or half-integer. The results
of Sect. 5.1.4 indicated that orbital angular momentum can only be characterized by
integer eigenvalues, but, as you can see, half-integer values are needed to deal with
the spin angular momentum. It is amusing to think that nature tends to find use for
everything that appears in abstract mathematical theories! To distinguish between
spin and orbital moments, I will replace notation $l$ for the maximum eigenvalue of
operator $L_z$ with notation $s$ for the maximum eigenvalue of $S_z$. The lowest nonzero value that
$s$ can take is 1/2, which means that there are only two possible eigenvalues of this
operator, $-\hbar/2$ and $\hbar/2$. The eigenvalue of the operator $S^2$ is $\hbar^2 s(s+1) = 3\hbar^2/4$,
but it is the value of $s$ that we have in mind when we are talking about the electron
having spin 1/2. Thus, Pauli’s two-valuedness of the electron comes here in the form
of two eigenvectors and two eigenvalues of the $z$-component of the spin operator.
The idea that the spin is an intrinsic and immutable property of electrons means
that the 1/2 value of the quantum number $s$ (or the $3\hbar^2/4$ eigenvalue of operator $S^2$) is as
unchangeable as an electron’s mass or charge, but at the same time, the electron can
be in various distinct spin states described by eigenvectors of $S_z$ or their arbitrary
superposition.


9.2 Spin 1/2 Operators and Spinors 

While spin 1/2 operators are characterized by the same commutation relations as 
the operator of the orbital angular momentum, they act on vectors that live in a 
two-dimensional space of spin states or spinors. There are no reasons to panic at 
the sound of the unfamiliar word. The term spinor is used to describe a specific 
class of abstract vectors, which have all the same properties as any other vectors 
belonging to a Hilbert space, only much simpler because the dimensionality of the 
spinor space is just 2. One can introduce a ket spinor $|\chi\rangle$, its adjoint bra spinor
$\langle\chi|$, and the inner product of spinors $\langle\chi|\chi'\rangle$, which has the same property as any other
inner product: $\langle\chi|\chi'\rangle = (\langle\chi'|\chi\rangle)^*$. A basis in this space can be formed by two
eigenvectors of the operator $S_z$, for which physicists use several notations, different in appearance
but otherwise equivalent. Two of the popular ways to designate these
eigenvectors are $|1/2\rangle$ for the state belonging to the eigenvalue $\hbar/2$ and $|-1/2\rangle$
for its counterpart accompanying the eigenvalue $-\hbar/2$. Alternatively, states with the
positive eigenvalue are often called spin-up states with the corresponding notation $|\uparrow\rangle$,
while states with the negative eigenvalue are called spin-down states and are notated
as $|\downarrow\rangle$. The main difference between spinors and vectors representing other states of
quantum systems is that the spinors do not have the coordinate representation. They 
exist separately from the vector spaces formed by the eigenvectors of position or 
momentum operators or any other observables related to them. Spinors describe 
intrinsic properties of electrons, while vectors from other spaces represent their 
extrinsic spatial-temporal states. 

This basis of the eigenvectors of operator S z can be used to construct a particular 
representation of spinors and spin operators—as I demonstrated about 100 pages 
ago in Sect. 5.2.3. Generic spinors in this basis are represented by 2 x 1 column 
vectors: 


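$$|\chi\rangle = \begin{bmatrix} a \\ b \end{bmatrix} = a \begin{bmatrix} 1 \\ 0 \end{bmatrix} + b \begin{bmatrix} 0 \\ 1 \end{bmatrix} \tag{9.8}$$

where $|\uparrow\rangle = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ represents the spin-up or $m = 1/2$ eigenvector, while $|\downarrow\rangle = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$
represents the spin-down or $m = -1/2$ eigenvector. The representation of the
respective bra vector is given by

$$\langle\chi| = \begin{bmatrix} a^* & b^* \end{bmatrix} = a^* \begin{bmatrix} 1 & 0 \end{bmatrix} + b^* \begin{bmatrix} 0 & 1 \end{bmatrix}, \tag{9.9}$$

and the norm is

$$\langle\chi|\chi\rangle = a^* a + b^* b. \tag{9.10}$$

Normalized spinors obviously obey the condition

$$|a|^2 + |b|^2 = 1. \tag{9.11}$$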

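Spin operators $S_x$, $S_y$, and $S_z$ defined with respect to a particular Cartesian coordinate
system are represented in the basis of the eigenvectors of $S_z$ by two-by-two matrices
derived in Sect. 5.2.3:

$$S_x = \frac{\hbar}{2} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \tag{9.12}$$

$$S_y = \frac{\hbar}{2} \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix} \tag{9.13}$$

$$S_z = \frac{\hbar}{2} \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \tag{9.14}$$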
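It takes only a few lines of code to convince yourself that these three matrices obey the angular momentum commutation relations and that $S^2 = \frac{3}{4}\hbar^2$ times the unit matrix. The sketch below is a minimal illustration in plain Python; the helper names (`matmul`, `combine`, `commutator`, `is_close`) are hypothetical, and I set $\hbar = 1$ for convenience.

```python
HBAR = 1.0  # assumption: work in units where hbar = 1


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combine(A, B, alpha=1, beta=1):
    """Return alpha * A + beta * B."""
    return [[alpha * A[i][j] + beta * B[i][j] for j in range(2)] for i in range(2)]


def commutator(A, B):
    return combine(matmul(A, B), matmul(B, A), 1, -1)


def is_close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))


ZERO = [[0, 0], [0, 0]]
IDENTITY = [[1, 0], [0, 1]]
SX = [[0, HBAR / 2], [HBAR / 2, 0]]             # Eq. 9.12
SY = [[0, -1j * HBAR / 2], [1j * HBAR / 2, 0]]  # Eq. 9.13
SZ = [[HBAR / 2, 0], [0, -HBAR / 2]]            # Eq. 9.14

# [S_x, S_y] = i hbar S_z and cyclic permutations
print(is_close(commutator(SX, SY), combine(SZ, ZERO, 1j * HBAR, 0)))
print(is_close(commutator(SY, SZ), combine(SX, ZERO, 1j * HBAR, 0)))
print(is_close(commutator(SZ, SX), combine(SY, ZERO, 1j * HBAR, 0)))

# S^2 = S_x^2 + S_y^2 + S_z^2 = (3/4) hbar^2 * identity
S2 = combine(combine(matmul(SX, SX), matmul(SY, SY)), matmul(SZ, SZ))
print(is_close(S2, combine(IDENTITY, ZERO, 0.75 * HBAR**2, 0)))
```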
Equations 9.12-9.14 are just a recapitulation of Eqs. 5.110 and 5.111 from
Sect. 5.2.3, which I placed here for your convenience. Spin operators are often
expressed in terms of the so-called Pauli matrices $\sigma_x$, $\sigma_y$, and $\sigma_z$ defined as

$$\sigma_x = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \tag{9.15}$$

$$\sigma_y = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix} \tag{9.16}$$

$$\sigma_z = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \tag{9.17}$$

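These matrices have a number of important properties such as

$$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = I, \tag{9.18}$$

which means that they are simultaneously Hermitian and unitary, and

$$\begin{aligned} \sigma_x \sigma_y + \sigma_y \sigma_x &= 0 \\ \sigma_x \sigma_z + \sigma_z \sigma_x &= 0 \\ \sigma_z \sigma_y + \sigma_y \sigma_z &= 0, \end{aligned} \tag{9.19}$$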

which is often expressed as an anticommutativity property. Pauli matrices are used 
quite often in quantum mechanics, so it makes sense to acquaint yourselves with 
their properties. For instance, one can prove that the property expressed by Eq. 9.18 
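is valid for any matrix of the form $\sigma_n = \boldsymbol{\sigma} \cdot \mathbf{n}$, where $\mathbf{n}$ is an arbitrary unit vector
and $\boldsymbol{\sigma}$ is a vector with components given by the Pauli matrices. Using the representation
of the unit vector in spherical coordinates

$$\begin{aligned} n_x &= \sin\theta \cos\varphi \\ n_y &= \sin\theta \sin\varphi \\ n_z &= \cos\theta, \end{aligned} \tag{9.20}$$

where $\theta$ and $\varphi$ are the polar and azimuthal angles defining the direction of $\mathbf{n}$ with respect
to a particular system of Cartesian coordinate axes (see Fig. 9.1), you can derive for
the matrix $\sigma_n = \sin\theta \cos\varphi \, \sigma_x + \sin\theta \sin\varphi \, \sigma_y + \cos\theta \, \sigma_z$:

$$\sigma_n = \begin{bmatrix} \cos\theta & \sin\theta \, e^{-i\varphi} \\ \sin\theta \, e^{i\varphi} & -\cos\theta \end{bmatrix}$$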
Fig. 9.1 Unit vector in the Cartesian coordinate system



Squaring it will get you 


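$$\sigma_n^2 = \begin{bmatrix} \cos\theta & \sin\theta \, e^{-i\varphi} \\ \sin\theta \, e^{i\varphi} & -\cos\theta \end{bmatrix} \begin{bmatrix} \cos\theta & \sin\theta \, e^{-i\varphi} \\ \sin\theta \, e^{i\varphi} & -\cos\theta \end{bmatrix} = \begin{bmatrix} \cos^2\theta + \sin^2\theta & \cos\theta \sin\theta \, e^{-i\varphi} - \cos\theta \sin\theta \, e^{-i\varphi} \\ \cos\theta \sin\theta \, e^{i\varphi} - \cos\theta \sin\theta \, e^{i\varphi} & \sin^2\theta + \cos^2\theta \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$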


This property makes the evaluation of various functions with Pauli matrices as 
arguments relatively easy. One of the popular examples is the exponential function 
$\exp(i \boldsymbol{\sigma} \cdot \mathbf{n})$, which you will enjoy computing when you get to the problem section
of this chapter. 
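Before doing the algebra by hand, you can check numerically what the property $\sigma_n^2 = I$ buys you. The sketch below sums the Taylor series of an exponential term by term and compares it with the closed form $\cos\alpha \, I + i \sin\alpha \, \sigma_n$, which follows from $\sigma_n^2 = I$. The real parameter $\alpha$ multiplying $\sigma_n$ in the exponent is my own generalization introduced for illustration ($\alpha = 1$ gives the function mentioned above), and all function names are hypothetical.

```python
import cmath
import math


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def sigma_n(theta, phi):
    """Matrix of sigma . n for the unit vector n of Eq. 9.20."""
    return [[math.cos(theta), math.sin(theta) * cmath.exp(-1j * phi)],
            [math.sin(theta) * cmath.exp(1j * phi), -math.cos(theta)]]


def exp_series(alpha, theta, phi, terms=40):
    """Sum the Taylor series of exp(i * alpha * sigma_n) term by term."""
    s = sigma_n(theta, phi)
    result = [[1, 0], [0, 1]]
    term = [[1, 0], [0, 1]]
    for k in range(1, terms):
        # next term = previous term * (i * alpha * sigma_n) / k
        term = [[1j * alpha / k * x for x in row] for row in matmul(term, s)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


def closed_form(alpha, theta, phi):
    """cos(alpha) * I + i * sin(alpha) * sigma_n, valid because sigma_n^2 = I."""
    s = sigma_n(theta, phi)
    c, d = math.cos(alpha), 1j * math.sin(alpha)
    return [[c * (i == j) + d * s[i][j] for j in range(2)] for i in range(2)]


def max_difference(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))


theta, phi, alpha = 0.7, 1.9, 1.3   # arbitrary test values
print(max_difference(matmul(sigma_n(theta, phi), sigma_n(theta, phi)), [[1, 0], [0, 1]]))
print(max_difference(exp_series(alpha, theta, phi), closed_form(alpha, theta, phi)))
```

Both printed numbers should be at the level of floating-point rounding errors.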

To help you become more comfortable with spin operators, I will now consider 
a few examples. 

Example 21 (Measurement of the y-Component of the Spin) Assume that a single
immobile electron is placed in the state described by the spin-up eigenvector of
operator $S_z$. Using a magnetic field directed along the Y-axis of the coordinate system,
you are probing possible values of the y-component of the spin. What are these 
possible values and what are their probabilities? 

Solution 

As with any observable, possible results of its measurement are given by the
eigenvalues of the respective operator. In this case this operator is $S_y$ and you need to
determine its eigenvalues. The answer is, of course, obvious ($+\hbar/2$ and $-\hbar/2$), but
let’s play the game and compute it. Besides, along the way you will determine the
eigenvectors, which you need to answer the probability question. So, the eigenvector
equation is

$$\frac{\hbar}{2} \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \frac{\hbar}{2} \lambda \begin{bmatrix} a \\ b \end{bmatrix}$$

which produces a set of two linear equations

$$\begin{aligned} -ib &= \lambda a \\ ia &= \lambda b. \end{aligned} \tag{9.21}$$

The condition for the existence of nontrivial solutions, given by the vanishing of the
determinant

$$\begin{vmatrix} \lambda & i \\ -i & \lambda \end{vmatrix} = 0,$$

becomes $\lambda^2 - 1 = 0$, yielding $\lambda_{1,2} = \pm 1$. Thus, recalling the factor $\hbar/2$ that I prudently
pulled out, you can conclude that the eigenvalues are, indeed, as predicted, $\pm\hbar/2$.
The first eigenvector is found by substituting $\lambda = 1$ into Eq. 9.21. This gives $a = -ib$,
so that the respective eigenvector can be written as


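$$|\hbar/2_y\rangle = b \begin{bmatrix} -i \\ 1 \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} -i \\ 1 \end{bmatrix} \tag{9.22}$$

where at the last step I normalized it, requiring that $2|b|^2 = 1$. Repeating this
procedure with $\lambda = -1$, I find

$$|-\hbar/2_y\rangle = \frac{1}{\sqrt{2}} \begin{bmatrix} i \\ 1 \end{bmatrix}. \tag{9.23}$$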


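Taking into account that the initial state was $|\hbar/2_z\rangle = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, I find that the
probabilities of the corresponding eigenvalues are

$$P_{\pm\hbar/2} = \left| \langle \pm\hbar/2_y | \hbar/2_z \rangle \right|^2 = \left| \frac{1}{\sqrt{2}} \begin{bmatrix} \pm i & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right|^2 = \frac{1}{2}.$$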

Not a huge surprise, really. 
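If you would like to double-check Example 21 numerically, the following is a minimal sketch in plain Python (standard library only). It verifies that the vectors of Eqs. 9.22 and 9.23 are eigenvectors of the $\hat{S}_y$ matrix and evaluates both probabilities. The choice of units with $\hbar = 1$ and all function and variable names are assumptions made for this illustration.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1

# S_y in the basis of S_z eigenvectors: (hbar/2) * [[0, -i], [i, 0]]
S_Y = [[0, -0.5j * HBAR],
       [0.5j * HBAR, 0]]


def matvec(m, v):
    """Multiply a 2x2 matrix by a two-component column vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def inner(bra, ket):
    """Inner product <bra|ket>; the bra components are complex-conjugated."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))


def is_eigenvector(m, v, eigenvalue, tol=1e-12):
    """True if m v equals eigenvalue * v component by component."""
    return all(abs(x - eigenvalue * y) < tol for x, y in zip(matvec(m, v), v))


s = 1 / math.sqrt(2)
plus_y = [-1j * s, s]   # Eq. 9.22, eigenvalue +hbar/2
minus_y = [1j * s, s]   # Eq. 9.23, eigenvalue -hbar/2
up_z = [1.0, 0.0]       # initial state: spin-up eigenvector of S_z

p_plus = abs(inner(plus_y, up_z)) ** 2
p_minus = abs(inner(minus_y, up_z)) ** 2

print(is_eigenvector(S_Y, plus_y, HBAR / 2), is_eigenvector(S_Y, minus_y, -HBAR / 2))
print(p_plus, p_minus)
```

Both probabilities come out equal to one half within floating-point accuracy, and they add up to one, as they must for a complete set of outcomes.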

Example 22 (Measurement of an Arbitrarily Directed Planar Spin) What if we want 
to measure a component of the spin along a direction not necessarily aligned with 
one of the coordinate axes? Let me consider an example in which the measured 
component of the spin is in the Y-Z plane at an angle $\theta$ with the Z-axis and find 
possible outcomes and their probabilities assuming the same initial state as before. 

Solution 

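I can define the specified direction by a unit vector with y-component $\sin\theta$ and 
z-component $\cos\theta$. Introducing unit vectors $\mathbf{e}_y$ and $\mathbf{e}_z$ along the respective axes, this 
vector can be conveniently presented as $\mathbf{n} = \mathbf{e}_y\sin\theta + \mathbf{e}_z\cos\theta$. The component of 
the spin in the direction of $\mathbf{n}$ is given by the dot product $\hat{S}_n = \hat{\mathbf{S}}\cdot\mathbf{n} = \hat{S}_y\sin\theta + \hat{S}_z\cos\theta$. 
Using the matrix representation of the spin operators in the basis of the eigenvectors 
of $\hat{S}_z$, Eqs. 9.12-9.14, I find for $\hat{S}_n$ 

$$
\hat{S}_n = \frac{\hbar}{2}\sin\theta\begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix} + \frac{\hbar}{2}\cos\theta\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} = \frac{\hbar}{2}\begin{bmatrix} \cos\theta & -i\sin\theta \\ i\sin\theta & -\cos\theta \end{bmatrix}.
$$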

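The respective eigenvector equation becomes 

$$
\begin{bmatrix} \cos\theta & -i\sin\theta \\ i\sin\theta & -\cos\theta \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix} = \lambda\begin{bmatrix} a \\ b \end{bmatrix},
$$

and the equation for the eigenvalues takes the form 

$$
\begin{vmatrix} \cos\theta - \lambda & -i\sin\theta \\ i\sin\theta & -\cos\theta - \lambda \end{vmatrix} = -(\cos\theta - \lambda)(\cos\theta + \lambda) - \sin^2\theta = \lambda^2 - 1 = 0.
$$

I am not going to pretend that I am surprised that the eigenvalues are again $\pm\hbar/2$; 
what else could they be? 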

Equations for the eigenvectors can be written as 
1. $\lambda = 1$ 

$$
\begin{aligned}
a\cos\theta - ib\sin\theta &= a \\
-ib\sin\theta &= a(1 - \cos\theta) \\
-2ib\sin\frac{\theta}{2}\cos\frac{\theta}{2} &= 2a\sin^2\frac{\theta}{2} \\
-ib\cos\frac{\theta}{2} &= a\sin\frac{\theta}{2}.
\end{aligned}
$$

   There are, of course, multiple choices of the coefficients in this equation, but I 
   want to make the final form of the eigenvector as symmetric as possible, so I will 
   choose $a = A\cos\frac{\theta}{2}$ and $b = iA\sin\frac{\theta}{2}$, which obviously satisfy the equation with 
   an arbitrary $A$. The latter can be found from the normalization condition $|a|^2 + 
   |b|^2 = 1$, which gives $|A| = 1$, so I can take $A = 1$. Now, I can write the first eigenvector as 

$$
|\hbar/2_n\rangle = \begin{bmatrix} \cos\frac{\theta}{2} \\ i\sin\frac{\theta}{2} \end{bmatrix}. \qquad (9.24)
$$






















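2. $\lambda = -1$ 

$$
\begin{aligned}
a\cos\theta - ib\sin\theta &= -a \\
ib\sin\theta &= a(1 + \cos\theta) \\
2ib\sin\frac{\theta}{2}\cos\frac{\theta}{2} &= 2a\cos^2\frac{\theta}{2} \\
ib\sin\frac{\theta}{2} &= a\cos\frac{\theta}{2}.
\end{aligned}
$$

   Using the same trick as previously, I find this eigenvector to be 

$$
|{-\hbar/2}_n\rangle = \begin{bmatrix} \sin\frac{\theta}{2} \\ -i\cos\frac{\theta}{2} \end{bmatrix}. \qquad (9.25)
$$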


The direction described by $\theta = \pi/2$ corresponds to unit vector $\mathbf{n}$ pointing in the 
direction of the Y-axis, reducing this example to the previous one. Naturally, you 
would expect the eigenvectors found here to reduce to the respective eigenvectors 
from the previous example. However, by substituting $\theta = \pi/2$ into Eqs. 9.24 
and 9.25, you find that the resulting vectors do not coincide with Eqs. 9.22 
and 9.23. Did I do something wrong here? Not really, because it is easy to 
notice that the difference between the two results is a mere factor of $\pm i$, and we 
know that multiplication of an eigenvector by $i$ or by any other complex number 
of the form $\exp(i\varphi)$, where $\varphi$ is an arbitrary real number, does not change a 
quantum state and has no observable consequences. Finally, the probabilities that 
the measurements of the spin will produce one of the found eigenvalues are 


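$$
P_{\hbar/2} = \left|\begin{bmatrix} \cos\frac{\theta}{2} & -i\sin\frac{\theta}{2} \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right|^2 = \cos^2\frac{\theta}{2}
$$

$$
P_{-\hbar/2} = \left|\begin{bmatrix} \sin\frac{\theta}{2} & i\cos\frac{\theta}{2} \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right|^2 = \sin^2\frac{\theta}{2}.
$$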


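I can also use this result to find the expectation value of the operator $\hat{S}_n$ in the state 
$|\hbar/2_z\rangle$. The probabilistic definition of the mean, $\sum_i x_i p_i$, where $x_i$ is the value of the 
variable and $p_i$ is its probability, yields 

$$
\langle \hat{S}_n \rangle = \frac{\hbar}{2}\cos^2\frac{\theta}{2} - \frac{\hbar}{2}\sin^2\frac{\theta}{2} = \frac{\hbar}{2}\cos\theta,
$$

which is exactly the value you should have expected from a classical vector oriented 
along the Z-axis when computing its component in the direction of $\mathbf{n}$. The same 
result is obtained by computing the expectation value using the operator definition: 

$$
\langle \hat{S}_n \rangle = \langle \hbar/2_z|\hat{S}_n|\hbar/2_z\rangle = \frac{\hbar}{2}\begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} \cos\theta & -i\sin\theta \\ i\sin\theta & -\cos\theta \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \frac{\hbar}{2}\begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} \cos\theta \\ i\sin\theta \end{bmatrix} = \frac{\hbar}{2}\cos\theta.
$$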


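Example 23 (Measuring the z-Component in an Arbitrary Spinor State) You can 
also ask the question: what if the spin was prepared in a state presented by one of the 
eigenvectors of $\hat{S}_n$, say, $|\hbar/2_n\rangle$, and we were measuring the z-component of the spin? 
What would be the probabilities of obtaining $\hbar/2$ or $-\hbar/2$ and the expectation value 
of $\hat{S}_z$ in this situation? 

Solution 

The corresponding probabilities are given by the following expressions: 

$$
P_{\hbar/2} = \left|\begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} \cos\frac{\theta}{2} \\ i\sin\frac{\theta}{2} \end{bmatrix}\right|^2 = \cos^2\frac{\theta}{2}
$$

$$
P_{-\hbar/2} = \left|\begin{bmatrix} 0 & 1 \end{bmatrix}\begin{bmatrix} \cos\frac{\theta}{2} \\ i\sin\frac{\theta}{2} \end{bmatrix}\right|^2 = \sin^2\frac{\theta}{2},
$$

yielding exactly the same results. Obviously the expectation value will also be the 
same. 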
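The results of Examples 22 and 23 can be cross-checked with a short numerical sketch. The code below builds the matrix of $\hat{S}_n$ for a direction in the Y-Z plane, confirms that the vectors of Eqs. 9.24 and 9.25 are its eigenvectors, and compares the probabilities and the expectation value with $\cos^2(\theta/2)$, $\sin^2(\theta/2)$, and $\cos\theta$. Everything is measured in units of $\hbar/2$; the test angle and all names are assumptions chosen for illustration.

```python
import math


def spin_component_yz(theta):
    """S_n in units of hbar/2 for n in the Y-Z plane at angle theta to the Z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -1j * s], [1j * s, -c]]


def matvec(m, v):
    """Multiply a 2x2 matrix by a two-component column vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def inner(bra, ket):
    """Inner product <bra|ket>; the bra components are complex-conjugated."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))


theta = 0.7  # arbitrary test angle in radians
sn = spin_component_yz(theta)
plus_n = [math.cos(theta / 2), 1j * math.sin(theta / 2)]    # Eq. 9.24
minus_n = [math.sin(theta / 2), -1j * math.cos(theta / 2)]  # Eq. 9.25
up_z = [1.0, 0.0]

# Eigenvalue checks: S_n|+> = +|+> and S_n|-> = -|-> (in units of hbar/2)
res_plus = max(abs(x - y) for x, y in zip(matvec(sn, plus_n), plus_n))
res_minus = max(abs(x + y) for x, y in zip(matvec(sn, minus_n), minus_n))

# Example 22: measure S_n in the state |up_z>
p_plus = abs(inner(plus_n, up_z)) ** 2
p_minus = abs(inner(minus_n, up_z)) ** 2
mean_from_probabilities = p_plus - p_minus
mean_from_operator = inner(up_z, matvec(sn, up_z)).real

# Example 23: measure S_z in the state |+n>
p_up_given_n = abs(inner(up_z, plus_n)) ** 2

print(res_plus, res_minus)
print(p_plus, p_minus, mean_from_probabilities, mean_from_operator, p_up_given_n)
```

The equality of `p_plus` and `p_up_given_n` reflects the fact that $|\langle\alpha|\beta\rangle|^2 = |\langle\beta|\alpha\rangle|^2$ for any two states.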

These examples were designed to prepare you to answer an important but rather 
confusing question. The concept of spin is supposed to represent a vector quantity 
existing in our regular physical three-dimensional space. At the same time, the 
quantum objects used to describe spin, namely operators and spinors, have little relation 
to this space. While spin operators do have three components, they are not regular 
vectors, and the question about the “direction” of a vector operator does not make 
much sense. Spinors, representing spin states, are objects existing in an abstract two- 
dimensional space. Thus, the question is how these objects are connected with the 
physical space in which all our measurement apparatuses live. One might attempt 
to deflect this question by saying that after taking the expectation values of the 
spin operators for a given spin state, we will end up with a regular vector, which 
will provide us with the information about the spin and its direction. I can counter 
this by saying that this information is very limited. Indeed, I can also compute 
the uncertainty of each spin component, which will also give me a regular vector. 
The problem is that in the most generic situation, the vector obtained from the 
expectation values and the vector obtained from the uncertainties do not have to have 
the same direction, making it difficult to come up with a reasonable interpretation of 
these results. One way to avoid this ambiguity is to focus on eigenvectors, in which 
case expectation values provide the complete description of the situation. You only 
need to figure out the connection between the spatial direction, spin operators and 
corresponding eigenvectors. 

One way to answer this question is to do what we just did in the previous 
example: introduce a component of the spin operator in the direction of interest, find 
its eigenvectors, and analyze their connection to this direction. But I want to add a 
bit more intrigue to the issue and will use a different approach. Let me ask you this: 
what is the best way to write down a generic spinor? Equation 9.8, which does it by 
introducing two complex parameters, $a$ and $b$, is too general and does not contain 
all the information available about even the most generic spin states. Indeed, two 
complex numbers contain four independent real parameters, which can be brought 
out explicitly by writing $a$ and $b$ in the exponential form: $a = |a|\exp(i\varphi_a)$ and 
$b = |b|\exp(i\varphi_b)$. I can do better than that and reduce the number of parameters to 
just two without making the spinor any less generic. 

First, I am going to use the freedom of choice of the overall phase of the spinor. 
To this end I will multiply both $a$ and $b$ by $\exp[-i(\varphi_a + \varphi_b)/2]$, bringing the spinor 
into the following form: 

$$
|\chi\rangle = \begin{bmatrix} |a|\exp(-i\varphi/2) \\ |b|\exp(i\varphi/2) \end{bmatrix},
$$

where $\varphi = \varphi_b - \varphi_a$, and there are only three parameters left to worry about. 
Obviously, this is not the only way to eliminate one of the phases, but this one 
presents the spinor in a rather symmetric form, and just like all physicists, I have 
a soft spot for symmetry. Besides, frankly speaking, I know where I want to 
go and am just taking you along for the ride. The normalization imposes an additional 
condition on these parameters, telling me that I can use it to eliminate another one 
of them, reducing the total number to just two. After a few seconds of staring at 
Eq. 9.11, it can dawn on you that this equation looks similar to the fundamental 
trigonometric identity $\cos^2 x + \sin^2 x = 1$ and that you can automatically satisfy 
the normalization condition by choosing $|a| = \cos(\theta/2)$ and $|b| = \sin(\theta/2)$, 
expressing both $|a|$ and $|b|$ in terms of a single parameter $\theta/2$. If you are asking why 
$\theta/2$ and not just $\theta$, you will have the answer in a few minutes, just keep reading. 
Now, as promised, I have the expression for the generic normalized spinor: 


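$$
|\chi_1\rangle = \begin{bmatrix} \cos(\theta/2)\exp(-i\varphi/2) \\ \sin(\theta/2)\exp(i\varphi/2) \end{bmatrix} \qquad (9.26)
$$

with only two parameters, $\theta$ and $\varphi$. The choice I made for $|a|$ and $|b|$ is not 
unique, and I can generate another spinor by replacing $\cos(\theta/2)$ with $\sin(\theta/2)$ in the 
upper component and $\sin(\theta/2)$ with $-\cos(\theta/2)$ in the lower one: 

$$
|\chi_2\rangle = \begin{bmatrix} \sin(\theta/2)\exp(-i\varphi/2) \\ -\cos(\theta/2)\exp(i\varphi/2) \end{bmatrix}. \qquad (9.27)
$$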


It is easy to verify by computing $\langle\chi_1|\chi_2\rangle$ that these spinors are orthogonal (of 
course, I designed them with this particular goal in mind), and by generating 
matrices 

$$
|\chi_1\rangle\langle\chi_1| = \begin{bmatrix} \cos^2(\theta/2) & \cos(\theta/2)\sin(\theta/2)\exp(-i\varphi) \\ \cos(\theta/2)\sin(\theta/2)\exp(i\varphi) & \sin^2(\theta/2) \end{bmatrix}
$$

and 

$$
|\chi_2\rangle\langle\chi_2| = \begin{bmatrix} \sin^2(\theta/2) & -\cos(\theta/2)\sin(\theta/2)\exp(-i\varphi) \\ -\cos(\theta/2)\sin(\theta/2)\exp(i\varphi) & \cos^2(\theta/2) \end{bmatrix}
$$

you can also check that 

$$
|\chi_1\rangle\langle\chi_1| + |\chi_2\rangle\langle\chi_2| = \hat{I},
$$

indicating that these two spinors form a complete set. (When trying to reproduce 
these calculations, do not forget complex conjugation when converting kets into 
respective bra vectors.) 
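For readers who prefer to see this construction at work, here is a small sketch that takes an arbitrary (unnormalized) pair of complex numbers $a$, $b$, extracts the two angles, and checks orthogonality and completeness of the spinors of Eqs. 9.26 and 9.27. The helper `bloch_angles` is a hypothetical name introduced here; it takes $\varphi$ as the phase of $b$ minus the phase of $a$, which is the sign convention that reproduces the form of Eq. 9.26.

```python
import cmath
import math


def bloch_angles(a, b):
    """Return (theta, phi) for a spinor (a, b) after normalization and removal of the overall phase."""
    theta = 2 * math.atan2(abs(b), abs(a))    # |a| : |b| = cos(theta/2) : sin(theta/2)
    phi = cmath.phase(b) - cmath.phase(a)     # relative phase of the two components
    return theta, phi


def chi1(theta, phi):
    """Spinor of Eq. 9.26."""
    return [math.cos(theta / 2) * cmath.exp(-0.5j * phi),
            math.sin(theta / 2) * cmath.exp(0.5j * phi)]


def chi2(theta, phi):
    """Spinor of Eq. 9.27."""
    return [math.sin(theta / 2) * cmath.exp(-0.5j * phi),
            -math.cos(theta / 2) * cmath.exp(0.5j * phi)]


def inner(bra, ket):
    """Inner product <bra|ket>; the bra components are complex-conjugated."""
    return sum(x.conjugate() * y for x, y in zip(bra, ket))


def projector(v):
    """Matrix |v><v| with elements v_i * conj(v_j)."""
    return [[vi * vj.conjugate() for vj in v] for vi in v]


a, b = 1 + 2j, -0.5 + 1j            # arbitrary unnormalized spinor
theta, phi = bloch_angles(a, b)
x1, x2 = chi1(theta, phi), chi2(theta, phi)

norm = math.sqrt(abs(a) ** 2 + abs(b) ** 2)
overlap = abs(inner(x1, [a, b])) / norm   # equals 1 if (a, b) is chi1 up to a phase
orthogonality = abs(inner(x1, x2))
p1, p2 = projector(x1), projector(x2)
completeness = [[p1[i][j] + p2[i][j] for j in range(2)] for i in range(2)]

print(theta, phi, overlap, orthogonality)
print(completeness)
```

An overlap of modulus one means that the original spinor and $|\chi_1\rangle$ differ only by normalization and an overall phase factor, that is, they describe the same state.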

Thus, with little effort, I have constructed a complete set of two generic mutually 
orthogonal spinors characterized by parameters that can be interpreted as angles, 
and this must mean something. The found representation of spinors establishes a 
one-to-one relationship between the two-dimensional space of spin states and points 
on the surface of a regular three-dimensional sphere of unit radius (see Fig. 9.2). 
The points at the north and south poles of the sphere, characterized by $\theta = 0$ 
and $\theta = \pi$, describe the eigenvectors of the operator $\hat{S}_z$, $|\uparrow\rangle$ and $|\downarrow\rangle$, respectively 
(angle $\varphi$ is not defined for these points, but it is not a problem because the respective 
factors $\exp(\mp i\varphi/2)$ become in these cases simply insignificant phase factors). It 
is also easy to notice that the antipodal points lying on the opposite ends of an 
arbitrarily oriented diameter of the sphere correspond to two mutually orthogonal 
spin states. Indeed, spherical coordinates of the antipodal points are related to each 
other as $\theta_2 = \pi - \theta_1$, $\varphi_2 = \varphi_1 + \pi$. Substituting these expressions into Eq. 9.26, 
you will immediately obtain the spinor presented in Eq. 9.27, up to an insignificant 
overall phase factor of $-i$. While performing this 
operation, you can appreciate the wisdom of using half-angles $\theta/2$ and $\varphi/2$ in these 
expressions. 

In order to further figure out the physical meaning of the mapping between 
spinors and directions in regular 3-D space, consider the same operator, $\hat{S}_n = \hat{\mathbf{S}}\cdot\mathbf{n}$, 
which I discussed in the preceding example, but with a unit vector $\mathbf{n}$ defining a 
generic direction characterized by the same angles $\theta$, $\varphi$ as in Fig. 9.2. This is the 
same vector which I introduced in connection with Pauli matrices, Eq. 9.20, so that 
the operator $\hat{S}_n$ becomes 

$$
\hat{S}_n = \frac{\hbar}{2}\begin{bmatrix} \cos\theta & \sin\theta\, e^{-i\varphi} \\ \sin\theta\, e^{i\varphi} & -\cos\theta \end{bmatrix}.
$$










Fig. 9.2 The Bloch sphere: each point on the surface characterized by spherical 
coordinates $\theta$, $\varphi$ corresponds to a particular spin state 



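Now, let me apply this operator to the spinor presented in Eq. 9.26: 

$$
\begin{aligned}
\hat{S}_n|\chi_1\rangle &= \frac{\hbar}{2}\begin{bmatrix} \cos\theta & \sin\theta\, e^{-i\varphi} \\ \sin\theta\, e^{i\varphi} & -\cos\theta \end{bmatrix}\begin{bmatrix} \cos(\theta/2)\exp(-i\varphi/2) \\ \sin(\theta/2)\exp(i\varphi/2) \end{bmatrix} \\
&= \frac{\hbar}{2}\begin{bmatrix} \cos\theta\cos(\theta/2)\exp(-i\varphi/2) + \sin\theta\sin(\theta/2)\exp(-i\varphi/2) \\ \sin\theta\cos(\theta/2)\exp(i\varphi/2) - \cos\theta\sin(\theta/2)\exp(i\varphi/2) \end{bmatrix} \\
&= \frac{\hbar}{2}\begin{bmatrix} \cos(\theta/2)\exp(-i\varphi/2)\left(\cos\theta + 2\sin^2(\theta/2)\right) \\ \sin(\theta/2)\exp(i\varphi/2)\left(2\cos^2(\theta/2) - \cos\theta\right) \end{bmatrix} \\
&= \frac{\hbar}{2}\begin{bmatrix} \cos(\theta/2)\exp(-i\varphi/2)\left(2\cos^2(\theta/2) - 1 + 2\sin^2(\theta/2)\right) \\ \sin(\theta/2)\exp(i\varphi/2)\left(2\cos^2(\theta/2) - 2\cos^2(\theta/2) + 1\right) \end{bmatrix} \\
&= \frac{\hbar}{2}\begin{bmatrix} \cos(\theta/2)\exp(-i\varphi/2) \\ \sin(\theta/2)\exp(i\varphi/2) \end{bmatrix}.
\end{aligned}
$$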


Isn’t that nice? A generic spinor with arbitrarily introduced parameters $\theta$ and $\varphi$ 
turned out to be an eigenvector of an operator representing the component of the 
spin in the direction defined by these parameters. It probably will not come as 
a particularly great surprise now that the second spinor I conjured up is also 
an eigenvector of the same operator but corresponding to the second eigenvalue, 
namely, $-\hbar/2$. (Check it out as an exercise. And by the way, did you notice that 
in the course of this computation, I used a couple of trigonometric identities such 
as $\cos x = 2\cos^2(x/2) - 1$ and $\sin x = 2\sin(x/2)\cos(x/2)$?) This exercise allows us 
to give more substance to an already established connection between spinors and 
directions in physical space: each spinor parametrized as in Eq. 9.26 or 9.27 is an 
eigenvector of a component of the spin in the direction specified by parameters $\theta$ 
and $\varphi$ interpreted as spherical coordinates of the corresponding unit vector lying 
on the surface of the Bloch sphere. The measurement of the spin in this direction 
will yield definite results corresponding to the respective eigenvalue, so it can be 
interpreted as the direction of the spin for this particular spin state. It also makes 
sense that antipodal points on the Bloch sphere represent eigenvectors belonging 
to opposite eigenvalues of $\hat{S}_n$. Finally, by now, I hope you have the answer to the 
question why I used half-angles in the definition of the spinors. 
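The exercise suggested above can also be done numerically. The sketch below checks, for one arbitrarily chosen pair of angles, that the spinors of Eqs. 9.26 and 9.27 are eigenvectors of $\hat{S}_n$ with eigenvalues $+\hbar/2$ and $-\hbar/2$, and that the spinor of Eq. 9.26 taken at the antipodal point coincides with the spinor of Eq. 9.27 up to a phase factor. The operator is expressed in units of $\hbar/2$; the test angles and function names are assumptions made for illustration.

```python
import cmath
import math


def spin_along(theta, phi):
    """S_n in units of hbar/2 for n = (sin(theta)cos(phi), sin(theta)sin(phi), cos(theta))."""
    return [[math.cos(theta), math.sin(theta) * cmath.exp(-1j * phi)],
            [math.sin(theta) * cmath.exp(1j * phi), -math.cos(theta)]]


def chi1(theta, phi):
    """Spinor of Eq. 9.26."""
    return [math.cos(theta / 2) * cmath.exp(-0.5j * phi),
            math.sin(theta / 2) * cmath.exp(0.5j * phi)]


def chi2(theta, phi):
    """Spinor of Eq. 9.27."""
    return [math.sin(theta / 2) * cmath.exp(-0.5j * phi),
            -math.cos(theta / 2) * cmath.exp(0.5j * phi)]


def matvec(m, v):
    """Multiply a 2x2 matrix by a two-component column vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def inner(bra, ket):
    """Inner product <bra|ket>; the bra components are complex-conjugated."""
    return sum(x.conjugate() * y for x, y in zip(bra, ket))


def residual(m, v, eigenvalue):
    """Largest component of m v - eigenvalue * v; zero for an exact eigenvector."""
    return max(abs(x - eigenvalue * y) for x, y in zip(matvec(m, v), v))


theta, phi = 1.1, 2.3   # arbitrary test direction (radians)
sn = spin_along(theta, phi)
r1 = residual(sn, chi1(theta, phi), +1)
r2 = residual(sn, chi2(theta, phi), -1)

# Antipodal point on the Bloch sphere: theta -> pi - theta, phi -> phi + pi
antipode = chi1(math.pi - theta, phi + math.pi)
overlap_with_chi2 = abs(inner(chi2(theta, phi), antipode))

print(r1, r2, overlap_with_chi2)
```

Both residuals vanish within rounding errors, and the overlap has modulus one, which is the numerical counterpart of the statement that antipodal points represent the two eigenvectors of the same operator $\hat{S}_n$.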


9.3 Dynamics of Spin in a Uniform Magnetic Field 

A bound (for instance, by attraction to a nucleus) electron in a uniform magnetic 
field, the system used earlier to introduce the Zeeman effect, is also the simplest 
somewhat realistic physical model allowing one to study quantum dynamics of a 
pure spin. Assuming that the interaction between the spin and the magnetic field 
does not affect the orbital state of the electron, one can ignore energy associated 
with the latter and omit the atomic part of the Hamiltonian (remember, energy only 
matters when it changes, and if it does not, we can always make it equal to zero). 
The Hamiltonian of this system is obtained by dropping the $\hat{H}_0$ term from Eq. 9.3 and 
replacing orbital angular momentum $\hat{\mathbf{L}}$ with $2\hat{\mathbf{S}}$, where the factor 2 takes into account 
the empirically established modification of the connection between spin angular and 
magnetic momenta, Eq. 9.7. The resulting Hamiltonian takes the form 

$$
\hat{H} = \frac{2\mu_B}{\hbar}\mathbf{B}\cdot\hat{\mathbf{S}}. \qquad (9.28)
$$

Note that magnetic field $\mathbf{B}$ is not an operator because it describes a classical 
magnetic field created by a source whose physics is outside of our consideration. 
Since the field is uniform, it makes sense to use its direction as one of the axes of the 
coordinate system, which I have to specify in order to be able to carry out subsequent 
calculations. It is customary to choose axis Z as the one which is codirected with 
the magnetic field, in which case Hamiltonian 9.28 significantly simplifies: 

$$
\hat{H} = \frac{2\mu_B B}{\hbar}\hat{S}_z. \qquad (9.29)
$$


In this section I will discuss the dynamics of spin described by this Hamiltonian 
using both Schrödinger and Heisenberg pictures of quantum mechanics. 


9.3.1 Schrödinger Picture 

In the Schrödinger picture, we always begin by establishing the eigenvalues 
and eigenvectors of the Hamiltonian. It is obvious that the eigenvectors of the 
Hamiltonian given by Eq. 9.29 coincide with those of operator $\hat{S}_z$, which I will 
denote here as $|\uparrow\rangle$ with eigenvalue $\hbar/2$ (spin-up) and $|\downarrow\rangle$ with eigenvalue $-\hbar/2$ 
(spin-down). The respective eigenvalues of the Hamiltonian are quite obvious and 
are 

$$
\begin{aligned}
E_\uparrow &= \mu_B B \\
E_\downarrow &= -\mu_B B.
\end{aligned} \qquad (9.30)
$$

A solution of the time-dependent Schrödinger equation for an arbitrary 
time-dependent spinor $|\chi(t)\rangle$, 

$$
i\hbar\frac{\partial|\chi(t)\rangle}{\partial t} = \frac{2\mu_B B}{\hbar}\hat{S}_z|\chi(t)\rangle,
$$

can be presented as a linear combination of the stationary states of the Hamiltonian, 

$$
|\chi(t)\rangle = a\exp\left(-i\frac{\mu_B B}{\hbar}t\right)|\uparrow\rangle + b\exp\left(i\frac{\mu_B B}{\hbar}t\right)|\downarrow\rangle, \qquad (9.31)
$$

where coefficients $a$ and $b$ are determined by the initial state of the spin: 

$$
|\chi(0)\rangle = a|\uparrow\rangle + b|\downarrow\rangle. \qquad (9.32)
$$

Equation 9.31 essentially solves the problem of the dynamics of a single spin in a 
uniform magnetic field. It, however, does little to develop our intuition about the 
physical phenomena that this solution describes. In a typical experimental 
situation, one is rarely dealing with a single spin. Most frequently, an experimentalist 
would measure a signal from an ensemble of many spins, and if we can neglect any 
kind of interaction between them, as well as assume that all spins are in the same 
initial state,¹ the experimental results can be described by finding the expectation 
values of the spin operators. So, let me compute these expectation values for the 
state described by Eq. 9.31. 

To this end I will use the representation of a generic spinor in the form of Eq. 9.26 
and rewrite coefficients $a$ and $b$ as 

$$
\begin{aligned}
a &= \cos(\theta/2)\exp(-i\varphi/2) \\
b &= \sin(\theta/2)\exp(i\varphi/2).
\end{aligned} \qquad (9.33)
$$

¹ The assumption about the same initial state is the most difficult to realize experimentally and can 
be justified only at zero temperature. 

Substituting these expressions for $a$ and $b$ into Eq. 9.31 and using the regular 
representation of basis spinors $|\uparrow\rangle$ and $|\downarrow\rangle$, you can find 

$$
|\chi(t)\rangle = \begin{bmatrix} \cos(\theta/2)\exp(-i\varphi/2)\exp\left(-i\frac{\mu_B B}{\hbar}t\right) \\ \sin(\theta/2)\exp(i\varphi/2)\exp\left(i\frac{\mu_B B}{\hbar}t\right) \end{bmatrix}. \qquad (9.34)
$$

It is easiest to compute the expectation value of $\hat{S}_z$: 

$$
\langle\hat{S}_z\rangle = \frac{\hbar}{2}\left(|a|^2 - |b|^2\right) = \frac{\hbar}{2}\cos\theta. \qquad (9.35)
$$


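I derived this expression taking advantage of the fact that $|\uparrow\rangle$ and $|\downarrow\rangle$ in Eq. 9.31 
are eigenvectors of $\hat{S}_z$, and, therefore, coefficients in front of them (their absolute 
values squared, of course) determine the probabilities of the respective eigenvalues. 
To find the expectation values of two other components, I will have to do a little bit 
more work computing 

$$
\langle\hat{S}_{x,y}\rangle = \langle\chi(t)|\hat{S}_{x,y}|\chi(t)\rangle.
$$

I begin with the x-component and first compute the right half of this expression, 
$\hat{S}_x|\chi(t)\rangle$: 

$$
\hat{S}_x|\chi(t)\rangle = \frac{\hbar}{2}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} \cos(\theta/2)\exp(-i\varphi/2)\exp\left(-i\frac{\mu_B B}{\hbar}t\right) \\ \sin(\theta/2)\exp(i\varphi/2)\exp\left(i\frac{\mu_B B}{\hbar}t\right) \end{bmatrix} = \frac{\hbar}{2}\begin{bmatrix} \sin(\theta/2)\exp(i\varphi/2)\exp\left(i\frac{\mu_B B}{\hbar}t\right) \\ \cos(\theta/2)\exp(-i\varphi/2)\exp\left(-i\frac{\mu_B B}{\hbar}t\right) \end{bmatrix}.
$$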


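By the way, have you noticed how operator $\hat{S}_x$ flips the components of the spinor? 
Anyway, to complete this computation, I find the inner product of this ket with the 
bra version of the spinor in Eq. 9.31: 

$$
\begin{aligned}
\langle\hat{S}_x\rangle &= \frac{\hbar}{2}\left[\cos(\theta/2)\sin(\theta/2)\exp(i\varphi)\exp\left(i\frac{2\mu_B B}{\hbar}t\right) + \cos(\theta/2)\sin(\theta/2)\exp(-i\varphi)\exp\left(-i\frac{2\mu_B B}{\hbar}t\right)\right] \\
&= \frac{\hbar}{2}\sin\theta\cos\left(\frac{2\mu_B B}{\hbar}t + \varphi\right).
\end{aligned}
$$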


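Similar calculations with the y-component operator yield 

$$
\langle\hat{S}_y\rangle = \frac{\hbar}{2}\sin\theta\sin\left(\frac{2\mu_B B}{\hbar}t + \varphi\right).
$$

Let’s collect all these results together to get a better picture: 

$$
\begin{aligned}
\langle\hat{S}_z\rangle &= \frac{\hbar}{2}\cos\theta \\
\langle\hat{S}_x\rangle &= \frac{\hbar}{2}\sin\theta\cos\left(\frac{2\mu_B B}{\hbar}t + \varphi\right) \\
\langle\hat{S}_y\rangle &= \frac{\hbar}{2}\sin\theta\sin\left(\frac{2\mu_B B}{\hbar}t + \varphi\right).
\end{aligned} \qquad (9.36)
$$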

Here is what we have: a vector of length $\hbar/2$ remains at all times at angle $\theta$ with 
respect to the magnetic field, but its projection on the X-Y plane of the coordinate 
system (which is perpendicular to the magnetic field) rotates with frequency 
$\omega_L = 2\mu_B B/\hbar = eB/m_e$, where I substituted Eq. 9.5 for the Bohr magneton. 
A remarkable fact about this result is the disappearance of Planck’s constant 
from the final expression for frequency, which signals that this phenomenon must 
exist in classical physics as well, and it, indeed, does. Equation 9.36 describes 
a very well-known effect, Larmor precession, which is observed whenever 
a magnetic moment (of any nature) interacts with a uniform magnetic field. 
However, the frequency of the precession might be different for different magnetic 
moments because of its dependence on the so-called gyromagnetic ratio defined as a 
coefficient of proportionality between the magnetic dipole moment and the angular 
momentum. For the orbital angular momentum, this ratio is $-e/(2m_e)$ as given by 
Eq. 9.1, while for the spin, it is two times larger, resulting in a precession 
frequency twice as large. 
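To see the precession with your own eyes, you can tabulate the expectation values at a few moments of time. The sketch below evolves the spinor under the Hamiltonian $\hat{H} = \omega_L\hat{S}_z$, starting from the initial state parametrized by angles $\theta$ and $\varphi$, and prints $\langle\hat{S}_z\rangle$, the length of the expectation-value vector, and its azimuthal angle in the X-Y plane. The units with $\hbar = 1$, the numerical value of the Larmor frequency, the initial angles, and all names are assumptions chosen for illustration.

```python
import cmath
import math

HBAR = 1.0      # assumption: units in which hbar = 1
OMEGA_L = 2.0   # illustrative Larmor frequency 2*mu_B*B/hbar, arbitrary units

SX = [[0, HBAR / 2], [HBAR / 2, 0]]
SY = [[0, -0.5j * HBAR], [0.5j * HBAR, 0]]
SZ = [[HBAR / 2, 0], [0, -HBAR / 2]]


def state(theta, phi, t):
    """Spinor at time t for H = OMEGA_L * S_z; energies are +-HBAR*OMEGA_L/2."""
    up = math.cos(theta / 2) * cmath.exp(-0.5j * phi) * cmath.exp(-0.5j * OMEGA_L * t)
    down = math.sin(theta / 2) * cmath.exp(0.5j * phi) * cmath.exp(0.5j * OMEGA_L * t)
    return [up, down]


def expectation(op, v):
    """Expectation value <v|op|v> for a normalized two-component spinor."""
    w = [op[0][0] * v[0] + op[0][1] * v[1],
         op[1][0] * v[0] + op[1][1] * v[1]]
    return (v[0].conjugate() * w[0] + v[1].conjugate() * w[1]).real


def spin_vector(theta, phi, t):
    """Triple (<S_x>, <S_y>, <S_z>) at time t."""
    v = state(theta, phi, t)
    return tuple(expectation(op, v) for op in (SX, SY, SZ))


theta, phi = 0.9, 0.4   # initial direction on the Bloch sphere (radians)
for t in (0.0, 0.5, 1.0):
    sx, sy, sz = spin_vector(theta, phi, t)
    length = math.sqrt(sx ** 2 + sy ** 2 + sz ** 2)
    azimuth = math.atan2(sy, sx)
    print(f"t={t:.1f}  <Sz>={sz:.4f}  |<S>|={length:.4f}  azimuth={azimuth:.4f}")
```

In this sketch $\langle\hat{S}_z\rangle$ and the length of the vector stay constant, while the azimuth starts from $\varphi$ at $t = 0$ and grows linearly in time at the rate $\omega_L$.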


9.3.2 Heisenberg Picture 

To describe spin precession in the Heisenberg picture, I have to solve Heisenberg 
equations 4.24 for spin operators. To simplify notation I will omit subindex $H$, 
which I used to distinguish Heisenberg from Schrödinger operators. However, it 
is important to note that the angular momentum commutation relations, Eqs. 3.53-3.55, 
remain the same for both pictures, provided that we take Heisenberg operators 
at the same time. If you do not see how to verify this statement, imagine 
sandwiching both sides of the commutation relation between operators $\exp(i\hat{H}t/\hbar)$ 
and $\exp(-i\hat{H}t/\hbar)$ and also inserting the product of these operators (which is equal 
to unity, by the way) between any two operators multiplied in the commutator. 
Thus, using the necessary commutation relations, I obtain the following equations: 



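$$
\begin{aligned}
\frac{d\hat{S}_z}{dt} &= \frac{i}{\hbar}\omega_L\left[\hat{S}_z, \hat{S}_z\right] = 0 & (9.37) \\
\frac{d\hat{S}_x}{dt} &= \frac{i}{\hbar}\omega_L\left[\hat{S}_z, \hat{S}_x\right] = -\omega_L\hat{S}_y & (9.38) \\
\frac{d\hat{S}_y}{dt} &= \frac{i}{\hbar}\omega_L\left[\hat{S}_z, \hat{S}_y\right] = \omega_L\hat{S}_x, & (9.39)
\end{aligned}
$$

where I introduced the Larmor frequency defined in the previous section into the 
Hamiltonian. Differentiating Eqs. 9.38 and 9.39 with respect to time, I can separate 
them into two independent differential equations of the second order: 

$$
\begin{aligned}
\frac{d^2\hat{S}_x}{dt^2} &= -\omega_L^2\hat{S}_x \\
\frac{d^2\hat{S}_y}{dt^2} &= -\omega_L^2\hat{S}_y
\end{aligned}
$$

with obvious solutions 

$$
\begin{aligned}
\hat{S}_x(t) &= \hat{A}_x\sin\omega_L t + \hat{B}_x\cos\omega_L t \\
\hat{S}_y(t) &= \hat{A}_y\sin\omega_L t + \hat{B}_y\cos\omega_L t.
\end{aligned}
$$

Unknown constant operators $\hat{A}_{x,y}$ and $\hat{B}_{x,y}$ are determined by the initial conditions 
for the spin operators and their derivatives: 

$$
\begin{aligned}
\hat{B}_x &= \hat{S}_x(0); \quad \hat{A}_x = \frac{1}{\omega_L}\left.\frac{d\hat{S}_x}{dt}\right|_{t=0} = -\hat{S}_y(0) \\
\hat{B}_y &= \hat{S}_y(0); \quad \hat{A}_y = \frac{1}{\omega_L}\left.\frac{d\hat{S}_y}{dt}\right|_{t=0} = \hat{S}_x(0),
\end{aligned}
$$

where $\hat{S}_{x,y}(0)$ coincide with the Schrödinger spin operators. Thus, I have 

$$
\begin{aligned}
\hat{S}_x(t) &= -\hat{S}_y(0)\sin\omega_L t + \hat{S}_x(0)\cos\omega_L t & (9.40) \\
\hat{S}_y(t) &= \hat{S}_x(0)\sin\omega_L t + \hat{S}_y(0)\cos\omega_L t. & (9.41)
\end{aligned}
$$

All that is left to do is to compute the expectation values of the Schrödinger spin 
operators in the initial state given by Eqs. 9.32 and 9.33. However, I do not have to 
repeat these calculations as we can just read them off Eq. 9.36 at $t = 0$: 
$\langle\hat{S}_x(0)\rangle = (\hbar/2)\sin\theta\cos\varphi$ and $\langle\hat{S}_y(0)\rangle = (\hbar/2)\sin\theta\sin\varphi$. This yields 

$$
\begin{aligned}
\langle\hat{S}_x(t)\rangle &= \frac{\hbar}{2}\sin\theta\cos\varphi\cos\omega_L t - \frac{\hbar}{2}\sin\theta\sin\varphi\sin\omega_L t = \frac{\hbar}{2}\sin\theta\cos(\omega_L t + \varphi) \\
\langle\hat{S}_y(t)\rangle &= \frac{\hbar}{2}\sin\theta\cos\varphi\sin\omega_L t + \frac{\hbar}{2}\sin\theta\sin\varphi\cos\omega_L t = \frac{\hbar}{2}\sin\theta\sin(\omega_L t + \varphi),
\end{aligned}
$$

in complete agreement with the results obtained from the Schrödinger picture. 


9.4 Spin of a Two-Electron System 

9.4.1 Space of Two-Particle States 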

I will complete the discussion of the spin by considering a system of two electrons. 
The goal of this exercise is to figure out if it makes sense to talk about a total spin of 
this system understood as some kind of sum of two individual spins, $\hat{\mathbf{S}}_1 + \hat{\mathbf{S}}_2$. In 
classical physics that would have been a trivial question: of course, we can define 
the total angular momentum of several particles by just adding them up, remembering 
that they are vectors. You can even derive a total angular momentum conservation 
law valid in the absence of external torques, just like you can derive a total linear 
momentum conservation law if the system of particles is not exposed to external 
forces. In quantum mechanics, when individual spins are presented by operators 
acting in different spaces containing spin states of each particle, the answer to this 
question is more complex. It is still affirmative: yes, it is possible to define the total 
spin of a system of two (or more) particles by introducing a new operator, which 
can be formally defined as 

$$
\hat{\mathbf{S}}^{(tp)} = \hat{\mathbf{S}}^{(1)} + \hat{\mathbf{S}}^{(2)}, \qquad (9.42)
$$

where the upper index is an abbreviation of “two-particle.” However, so far Eq. 9.42 
is a purely formal expression, in which even the meaning of the sign “+” is not 
clear. What I need to do now is to figure out the properties of $\hat{\mathbf{S}}^{(tp)}$ and their relation 
to the properties of $\hat{\mathbf{S}}^{(1)}$ and $\hat{\mathbf{S}}^{(2)}$, which is not a trivial task. 

Operators are defined by their action on vectors, and, since vectors live in a 
certain vector space, the first step in defining an operator is to understand the space 
where the vectors on which the operator acts live. Operators $\hat{\mathbf{S}}_1$ and $\hat{\mathbf{S}}_2$ operate on 
vectors that live in different and unrelated spaces: one acts on spin states of one 
particle and the other on the states of a completely different particle. I can, however, 
combine these spaces to form a new extended space, which would include spin states 
of both particles. To define such a space, all I need is to define a basis in it, 
and then any other vector can be presented as a linear combination of the basis 
vectors. The space containing spin states of each individual particle is defined by two 
basis vectors for each particle. These states are eigenvectors of operators $\hat{S}_z^{(1)}$ and $\hat{S}_z^{(2)}$ 
(obviously defined in the same coordinate system) and can be depicted symbolically 
in a few equivalent ways discussed in previous sections. Here I will use spin-up 
and spin-down notations indicated by the vertical arrows $|\uparrow_{1,2}\rangle$ or $|\downarrow_{1,2}\rangle$, where 
subindexes 1 and 2 simply indicate the particle whose states these kets represent. 
In a system of two particles, there exist four different combinations of their spin 
states: both spins up, both spins down, the first spin up and the second spin down, 
and vice versa. You can create a notation for these states either by putting two state 
signifiers inside a single ket, like this: $|\uparrow_1, \uparrow_2\rangle$, or by sticking together two kets like 
this: $|\uparrow_1\rangle|\uparrow_2\rangle$. The difference between the two notations is superficial, and either 
one can be used freely, while the second notation with two separate kets is slightly 
more convenient when one needs to write down matrix elements corresponding to 
operators acting on different particles. Thus, I will present the four basis vectors in 
a new four-dimensional space containing states of a two-spin system as 

$$
|1\rangle \equiv |\uparrow_1\rangle|\uparrow_2\rangle, \quad |2\rangle \equiv |\uparrow_1\rangle|\downarrow_2\rangle, \quad |3\rangle \equiv |\downarrow_1\rangle|\uparrow_2\rangle, \quad |4\rangle \equiv |\downarrow_1\rangle|\downarrow_2\rangle. \qquad (9.43)
$$

Conversion from kets to bras occurs following all the standard rules of Hermitian 
conjugation applied to the states of both particles. 

A larger space formed from two smaller spaces in the described manner is 
called in mathematics a tensor product of spaces. It has all the standard algebraic 
properties of a linear vector space discussed in Sect. 2.2, and I only need to add the 
distributive properties involving vectors belonging to different components of the 
tensor product: 

$$
\begin{aligned}
\left(|v_1\rangle + |v_2\rangle\right)|w_1\rangle &= |v_1\rangle|w_1\rangle + |v_2\rangle|w_1\rangle \\
|w_1\rangle\left(|v_1\rangle + |v_2\rangle\right) &= |w_1\rangle|v_1\rangle + |w_1\rangle|v_2\rangle.
\end{aligned} \qquad (9.44)
$$

The inner product in the tensor space is defined as 

$$
\left(\langle e_1|\langle v_1|\right)\left(|e_2\rangle|v_2\rangle\right) = \langle e_1|e_2\rangle\langle v_1|v_2\rangle, \qquad (9.45)
$$

and it is quite obvious that this definition preserves the main property of the inner 
product, namely, that $\langle\beta|\alpha\rangle = \langle\alpha|\beta\rangle^*$. In the case of the two-spin system, you can, 
for instance, find using the notation from Eq. 9.43: 

$$
\langle 1|1\rangle = \langle\uparrow_1|\uparrow_1\rangle\langle\uparrow_2|\uparrow_2\rangle = 1,
$$

where it is presumed that vectors $|\uparrow_{1,2}\rangle$ are normalized. You can also find that the 
inner products involving different vectors from the basis vanish, such as 

$$
\begin{aligned}
\langle 1|2\rangle &= \langle\uparrow_1|\uparrow_1\rangle\langle\uparrow_2|\downarrow_2\rangle = 0 \\
\langle 1|3\rangle &= \langle\uparrow_1|\downarrow_1\rangle\langle\uparrow_2|\uparrow_2\rangle = 0.
\end{aligned}
$$

In reality you have already encountered the tensor product of spaces earlier in 
this book, even though I never used the name. One example was the construction 
of states of a three-dimensional harmonic oscillator from the states of the 
one-dimensional oscillators. 

To illustrate calculations of the inner product between vectors belonging to such 
a tensor product space, consider the following example. 

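Example 24 (Working with Vectors from a Tensor Product Space) Compute the 
norms of the following vectors as well as the inner product $\langle\beta|\alpha\rangle$: 

$$
\begin{aligned}
|\alpha\rangle &= \left(3i|\uparrow_1\rangle + 4|\downarrow_1\rangle\right)\left(2|\uparrow_2\rangle - i|\downarrow_2\rangle\right) \\
|\beta\rangle &= \left(2|\uparrow_1\rangle - i|\downarrow_1\rangle\right)\left(2|\uparrow_2\rangle - 3|\downarrow_2\rangle\right).
\end{aligned}
$$

Solution 

Since all the vectors in adjacent parenthetic expressions are kets and belong to 
different spaces, it is clear that I am dealing here with the tensor product of two 
spaces. Distributive properties, expressed by Eq. 9.44, allow me to convert these 
expressions into 

$$
\begin{aligned}
|\alpha\rangle &= 6i|\uparrow_1\rangle|\uparrow_2\rangle - 4i|\downarrow_1\rangle|\downarrow_2\rangle + 3|\uparrow_1\rangle|\downarrow_2\rangle + 8|\downarrow_1\rangle|\uparrow_2\rangle \\
|\beta\rangle &= 4|\uparrow_1\rangle|\uparrow_2\rangle + 3i|\downarrow_1\rangle|\downarrow_2\rangle - 6|\uparrow_1\rangle|\downarrow_2\rangle - 2i|\downarrow_1\rangle|\uparrow_2\rangle.
\end{aligned}
$$

Note that the order in which vectors belonging to different spaces are stacked 
together is completely irrelevant. Using the normalized and orthogonal basis 
introduced in Eq. 9.43, I can rewrite these expressions as 

$$
\begin{aligned}
|\alpha\rangle &= 6i|1\rangle - 4i|4\rangle + 3|2\rangle + 8|3\rangle \\
|\beta\rangle &= 4|1\rangle + 3i|4\rangle - 6|2\rangle - 2i|3\rangle.
\end{aligned}
$$

Now I can compute the norms and the inner product following the standard 
procedure, which yields 

$$
\begin{aligned}
\|\alpha\| &= \sqrt{36 + 16 + 9 + 64} = \sqrt{125} \\
\|\beta\| &= \sqrt{16 + 9 + 36 + 4} = \sqrt{65} \\
\langle\beta|\alpha\rangle &= 4\times 6i - 4i(-3i) - 3\times 6 + 8\times(2i) = -30 + 40i.
\end{aligned}
$$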
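The arithmetic of Example 24 is easy to reproduce on a computer. In the sketch below a single-particle ket is a list of two amplitudes (spin-up first), and the hypothetical helper `kron` forms the list of four amplitudes of the product state in the order of the basis vectors $|1\rangle$ through $|4\rangle$ of Eq. 9.43. The code also checks that the inner product factorizes as prescribed by Eq. 9.45. All names are assumptions made for this illustration.

```python
import math


def kron(u, v):
    """Amplitudes of |u>|v> ordered as |up,up>, |up,down>, |down,up>, |down,down>."""
    return [ui * vj for ui in u for vj in v]


def inner(bra, ket):
    """Inner product <bra|ket>; the bra components are complex-conjugated."""
    return sum(x.conjugate() * y for x, y in zip(bra, ket))


alpha_1, alpha_2 = [3j, 4], [2, -1j]   # factors of |alpha> for particles 1 and 2
beta_1, beta_2 = [2, -1j], [2, -3]     # factors of |beta> for particles 1 and 2

alpha = kron(alpha_1, alpha_2)
beta = kron(beta_1, beta_2)

norm_alpha = math.sqrt(inner(alpha, alpha).real)
norm_beta = math.sqrt(inner(beta, beta).real)
beta_alpha = inner(beta, alpha)

# Eq. 9.45: the inner product of product states is the product of inner products
factorized = inner(beta_1, alpha_1) * inner(beta_2, alpha_2)

print(alpha, beta)
print(norm_alpha, norm_beta, beta_alpha, factorized)
```

Computing `factorized` is often the quicker route by hand as well: it requires two inner products of two-component vectors instead of one inner product of four-component ones.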


Finally, I need to introduce the rule describing how spin operators act on the 
vectors in the tensor product space. The rule is actually very simple: the operators 
affect only the states of their own particles. To illustrate this rule, consider the 
following example. 

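Example 25 (Operator Action in Tensor Spaces) For the state $|\alpha\rangle$ from the previous 
example, compute 

(a) $\left(\hat{S}_z^{(1)} + \hat{S}_z^{(2)}\right)|\alpha\rangle$ 

(b) $\left(\hat{S}_+^{(1)} + \hat{S}_+^{(2)}\right)|\alpha\rangle$ 

Solution 

(a) 

$$
\begin{aligned}
&\left(\hat{S}_z^{(1)} + \hat{S}_z^{(2)}\right)\left(6i|\uparrow_1\rangle|\uparrow_2\rangle - 4i|\downarrow_1\rangle|\downarrow_2\rangle + 3|\uparrow_1\rangle|\downarrow_2\rangle + 8|\downarrow_1\rangle|\uparrow_2\rangle\right) = \\
&6i\hat{S}_z^{(1)}|\uparrow_1\rangle|\uparrow_2\rangle + 6i|\uparrow_1\rangle\hat{S}_z^{(2)}|\uparrow_2\rangle - 4i\hat{S}_z^{(1)}|\downarrow_1\rangle|\downarrow_2\rangle - 4i|\downarrow_1\rangle\hat{S}_z^{(2)}|\downarrow_2\rangle + \\
&3\hat{S}_z^{(1)}|\uparrow_1\rangle|\downarrow_2\rangle + 3|\uparrow_1\rangle\hat{S}_z^{(2)}|\downarrow_2\rangle + 8\hat{S}_z^{(1)}|\downarrow_1\rangle|\uparrow_2\rangle + 8|\downarrow_1\rangle\hat{S}_z^{(2)}|\uparrow_2\rangle = \\
&3i\hbar|\uparrow_1\rangle|\uparrow_2\rangle + 3i\hbar|\uparrow_1\rangle|\uparrow_2\rangle + 2i\hbar|\downarrow_1\rangle|\downarrow_2\rangle + 2i\hbar|\downarrow_1\rangle|\downarrow_2\rangle + \\
&\frac{3\hbar}{2}|\uparrow_1\rangle|\downarrow_2\rangle - \frac{3\hbar}{2}|\uparrow_1\rangle|\downarrow_2\rangle - 4\hbar|\downarrow_1\rangle|\uparrow_2\rangle + 4\hbar|\downarrow_1\rangle|\uparrow_2\rangle = \\
&6i\hbar|\uparrow_1\rangle|\uparrow_2\rangle + 4i\hbar|\downarrow_1\rangle|\downarrow_2\rangle
\end{aligned}
$$

(b) 

$$
\begin{aligned}
&\left(\hat{S}_+^{(1)} + \hat{S}_+^{(2)}\right)\left(6i|\uparrow_1\rangle|\uparrow_2\rangle - 4i|\downarrow_1\rangle|\downarrow_2\rangle + 3|\uparrow_1\rangle|\downarrow_2\rangle + 8|\downarrow_1\rangle|\uparrow_2\rangle\right) = \\
&6i\hat{S}_+^{(1)}|\uparrow_1\rangle|\uparrow_2\rangle + 6i|\uparrow_1\rangle\hat{S}_+^{(2)}|\uparrow_2\rangle - 4i\hat{S}_+^{(1)}|\downarrow_1\rangle|\downarrow_2\rangle - 4i|\downarrow_1\rangle\hat{S}_+^{(2)}|\downarrow_2\rangle + \\
&3\hat{S}_+^{(1)}|\uparrow_1\rangle|\downarrow_2\rangle + 3|\uparrow_1\rangle\hat{S}_+^{(2)}|\downarrow_2\rangle + 8\hat{S}_+^{(1)}|\downarrow_1\rangle|\uparrow_2\rangle + 8|\downarrow_1\rangle\hat{S}_+^{(2)}|\uparrow_2\rangle = \\
&-4i\hbar|\uparrow_1\rangle|\downarrow_2\rangle - 4i\hbar|\downarrow_1\rangle|\uparrow_2\rangle + 3\hbar|\uparrow_1\rangle|\uparrow_2\rangle + 8\hbar|\uparrow_1\rangle|\uparrow_2\rangle = \\
&11\hbar|\uparrow_1\rangle|\uparrow_2\rangle - 4i\hbar\left(|\uparrow_1\rangle|\downarrow_2\rangle + |\downarrow_1\rangle|\uparrow_2\rangle\right)
\end{aligned}
$$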
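The rule "each operator affects only the states of its own particle" has a simple matrix counterpart: an operator of particle 1 becomes the Kronecker product of its 2x2 matrix with the identity, and an operator of particle 2 becomes the Kronecker product of the identity with its matrix. The sketch below uses this to redo both parts of Example 25, taking $\hat{S}_+$ to be the raising operator with $\hat{S}_+|\downarrow\rangle = \hbar|\uparrow\rangle$ and $\hat{S}_+|\uparrow\rangle = 0$. The units with $\hbar = 1$ and the helper names `kron_matrix`, `total`, and `apply` are assumptions made for illustration.

```python
HBAR = 1.0  # assumption: units in which hbar = 1

SZ = [[HBAR / 2, 0], [0, -HBAR / 2]]
S_PLUS = [[0, HBAR], [0, 0]]   # raising operator: S_+|down> = hbar|up>, S_+|up> = 0
IDENTITY = [[1, 0], [0, 1]]


def kron_matrix(a, b):
    """Kronecker product of two 2x2 matrices as a 4x4 nested list."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def total(op):
    """Matrix of op acting on particle 1 plus op acting on particle 2."""
    first = kron_matrix(op, IDENTITY)
    second = kron_matrix(IDENTITY, op)
    return [[first[r][c] + second[r][c] for c in range(4)] for r in range(4)]


def apply(m, v):
    """Multiply a 4x4 matrix by a four-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


# |alpha> in the basis |1>, |2>, |3>, |4> of Eq. 9.43
alpha = [6j, 3, 8, -4j]

sz_alpha = apply(total(SZ), alpha)          # part (a)
splus_alpha = apply(total(S_PLUS), alpha)   # part (b)

print(sz_alpha)
print(splus_alpha)
```

The four printed amplitudes in each line are the coefficients in front of $|1\rangle$, $|2\rangle$, $|3\rangle$, and $|4\rangle$, so they can be compared term by term with the final lines of parts (a) and (b).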


9.4.2 Operator of the Total Spin 


The concept of the tensor product gives an exact mathematical meaning to Eq. 
9.42 and the “plus” sign in it as illustrated by the previous example. Indeed, if 
each operator appearing on the right-hand side of this equation is defined to act 
in a common space of two-particle states, then the plus sign generates the regular 
operator sum as defined in earlier chapters of this book. 

Now I can tackle the main question: what are the eigenvalues and eigenvectors 
of the components of the total spin operator defined by Eq. 9.42, and of its square 
$\left(\hat{\mathbf{S}}^{(tp)}\right)^2 = \left(\hat{\mathbf{S}}^{(1)} + \hat{\mathbf{S}}^{(2)}\right)^2$? When discussing any system of operators, the first 
question you must be concerned with is the commutation relations between these 
operators. The first commutator that needs to be dealt with is between operators 
$\hat{\mathbf{S}}^{(1)}$ and $\hat{\mathbf{S}}^{(2)}$, and it is quite obvious that any two components of these operators 
commute, i.e., 

$$
\left[\hat{S}_i^{(1)}, \hat{S}_j^{(2)}\right] = 0
$$

for all $i$, $j$ taking values $x$, $y$, and $z$. Indeed, since $\hat{S}_i^{(1)}$ only affects the states of 
particle 1, and $\hat{S}_j^{(2)}$ acts only on the states of particle 2, the order in which 
these operators are applied is irrelevant. Now it is quite easy to establish that all 
commutation relations for the components of operator $\hat{\mathbf{S}}^{(tp)}$ and its square are exactly 
the same as for any operator of angular momentum. This justifies the claim that there 
exists a system of vectors, $|S, M\rangle$, which are common eigenvectors of one of the 
components of $\hat{\mathbf{S}}^{(tp)}$, usually chosen to be $\hat{S}_z^{(tp)}$, and of the operator $\left(\hat{\mathbf{S}}^{(tp)}\right)^2$, characterized 
by two numbers $M$ and $S$ such that 

$$
\begin{aligned}
\hat{S}_z^{(tp)}|S, M\rangle &= \hbar M|S, M\rangle \\
\left(\hat{\mathbf{S}}^{(tp)}\right)^2|S, M\rangle &= \hbar^2 S(S + 1)|S, M\rangle.
\end{aligned}
$$

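It also can be claimed that $|M| \le S$ and that these numbers take integer or
half-integer values. What is missing at this point is the information about the actual
values that $S$ can take and its relation to eigenvalues of the spin operators of the
individual particles. Also, one would like to know about the connection between
eigenvectors of $\hat{S}_z^{tp}$ and $(\hat{\mathbf{S}}^{tp})^2$ and their single-particle counterparts. To answer
these questions, I am going to generate matrix representations of operators $\hat{S}_z^{tp}$ and
$(\hat{\mathbf{S}}^{tp})^2$ using the basis vectors defined in Eq. 9.43. I will start with operator $\hat{S}_z^{tp}$. The
application of this operator to the basis vectors yields (see examples above)

\[
\hat{S}_z^{tp}\,|\uparrow_1\rangle|\uparrow_2\rangle = \hbar\,|\uparrow_1\rangle|\uparrow_2\rangle \tag{9.46}
\]
\[
\hat{S}_z^{tp}\,|\uparrow_1\rangle|\downarrow_2\rangle = 0 \tag{9.47}
\]
\[
\hat{S}_z^{tp}\,|\downarrow_1\rangle|\uparrow_2\rangle = 0 \tag{9.48}
\]
\[
\hat{S}_z^{tp}\,|\downarrow_1\rangle|\downarrow_2\rangle = -\hbar\,|\downarrow_1\rangle|\downarrow_2\rangle . \tag{9.49}
\]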
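The four relations in Eqs. 9.46-9.49 can be double-checked numerically. What follows is only a minimal Python sketch under stated assumptions: units with $\hbar = 1$, the basis ordering $|\uparrow\uparrow\rangle$, $|\uparrow\downarrow\rangle$, $|\downarrow\uparrow\rangle$, $|\downarrow\downarrow\rangle$ of Eq. 9.43, and helper names (`kron`, `add`) that are purely illustrative and do not come from the text.

```python
from fractions import Fraction


def kron(a, b):
    """Kronecker (tensor) product of two square matrices stored as nested lists."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]


def add(a, b):
    """Element-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


half = Fraction(1, 2)
identity = [[1, 0], [0, 1]]
sz = [[half, 0], [0, -half]]  # single-particle S_z in the (up, down) basis, hbar = 1

# S_z^(1) acts on particle 1 only, S_z^(2) on particle 2 only.
sz_total = add(kron(sz, identity), kron(identity, sz))

diagonal = [sz_total[i][i] for i in range(4)]
off_diagonal = [sz_total[i][j] for i in range(4) for j in range(4) if i != j]
print("diagonal of S_z^tp:", [str(x) for x in diagonal])
print("all off-diagonal elements vanish:", all(x == 0 for x in off_diagonal))
```

The diagonal lists the eigenvalues belonging to the four basis vectors in the order of Eq. 9.43, so its entries can be compared directly with the right-hand sides of Eqs. 9.46-9.49.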


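These results indicate that the basis vectors defined in Eq. 9.43 are also eigenvectors
of the operator $\hat{S}_z^{tp}$ with eigenvalues $\pm\hbar$ and a doubly degenerate eigenvalue 0. Thus,
the matrix of this operator in this basis is a diagonal $4 \times 4$ matrix:

\[
\hat{S}_z^{tp} = \hbar
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix}
\]

where I have positioned the matrix elements in accord with the numeration of
eigenvectors introduced in Eq. 9.43. For instance, the right-hand side of Eq. 9.46,
where operator $\hat{S}_z^{tp}$ acts on the first of the basis vectors, represents the first column of
the matrix, which contains a single non-zero element; the right-hand side of Eq. 9.47
yields the second column, where all elements are zeroes; and so on.

Operator $(\hat{\mathbf{S}}^{tp})^2$ requires more work. First, let me rewrite it in terms of the
single-particle operators $\hat{\mathbf{S}}^{(1)}$ and $\hat{\mathbf{S}}^{(2)}$:

\[
\begin{aligned}
\left(\hat{\mathbf{S}}^{tp}\right)^2 &= \left(\hat{\mathbf{S}}^{(1)} + \hat{\mathbf{S}}^{(2)}\right)^2 = \left(\hat{\mathbf{S}}^{(1)}\right)^2 + \left(\hat{\mathbf{S}}^{(2)}\right)^2 + 2\hat{\mathbf{S}}^{(1)} \cdot \hat{\mathbf{S}}^{(2)} \\
&= \left(\hat{\mathbf{S}}^{(1)}\right)^2 + \left(\hat{\mathbf{S}}^{(2)}\right)^2 + 2\left(\hat{S}_x^{(1)}\hat{S}_x^{(2)} + \hat{S}_y^{(1)}\hat{S}_y^{(2)} + \hat{S}_z^{(1)}\hat{S}_z^{(2)}\right) \\
&= \left(\hat{\mathbf{S}}^{(1)}\right)^2 + \left(\hat{\mathbf{S}}^{(2)}\right)^2 + 2\hat{S}_z^{(1)}\hat{S}_z^{(2)} + 2\left[\frac{\hat{S}_+^{(1)} + \hat{S}_-^{(1)}}{2}\,\frac{\hat{S}_+^{(2)} + \hat{S}_-^{(2)}}{2} + \frac{\hat{S}_+^{(1)} - \hat{S}_-^{(1)}}{2i}\,\frac{\hat{S}_+^{(2)} - \hat{S}_-^{(2)}}{2i}\right] \\
&= \left(\hat{\mathbf{S}}^{(1)}\right)^2 + \left(\hat{\mathbf{S}}^{(2)}\right)^2 + 2\hat{S}_z^{(1)}\hat{S}_z^{(2)} + \hat{S}_+^{(1)}\hat{S}_-^{(2)} + \hat{S}_-^{(1)}\hat{S}_+^{(2)}
\end{aligned}
\]

where I replaced the x- and y-components of the spin operator by ladder operators
defined in Eqs. 3.59 and 3.60 adapted for spin operators. The last expression is
perfectly suited for generating the matrix of $(\hat{\mathbf{S}}^{tp})^2$. Applying this operator to each
of the basis vectors, I can again simply read out the columns of this matrix: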


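\[
\begin{aligned}
\left(\hat{\mathbf{S}}^{tp}\right)^2 |1\rangle &= \left[\left(\hat{\mathbf{S}}^{(1)}\right)^2 + \left(\hat{\mathbf{S}}^{(2)}\right)^2 + 2\hat{S}_z^{(1)}\hat{S}_z^{(2)} + \hat{S}_+^{(1)}\hat{S}_-^{(2)} + \hat{S}_-^{(1)}\hat{S}_+^{(2)}\right] |\uparrow_1\rangle|\uparrow_2\rangle \\
&= \frac{3}{4}\hbar^2\,|\uparrow_1\rangle|\uparrow_2\rangle + \frac{3}{4}\hbar^2\,|\uparrow_1\rangle|\uparrow_2\rangle + \frac{1}{2}\hbar^2\,|\uparrow_1\rangle|\uparrow_2\rangle = 2\hbar^2\,|\uparrow_1\rangle|\uparrow_2\rangle = 2\hbar^2\,|1\rangle .
\end{aligned}
\tag{9.50}
\]

The ladder operators do not contribute to the final result because the raising operator
applied to the spin-up vector yields zero. All other terms in this expression follow
from the standard properties of the spin operators. Continuing,

\[
\begin{aligned}
\left(\hat{\mathbf{S}}^{tp}\right)^2 |2\rangle &= \left[\left(\hat{\mathbf{S}}^{(1)}\right)^2 + \left(\hat{\mathbf{S}}^{(2)}\right)^2 + 2\hat{S}_z^{(1)}\hat{S}_z^{(2)} + \hat{S}_+^{(1)}\hat{S}_-^{(2)} + \hat{S}_-^{(1)}\hat{S}_+^{(2)}\right] |\uparrow_1\rangle|\downarrow_2\rangle \\
&= \frac{3}{4}\hbar^2\,|\uparrow_1\rangle|\downarrow_2\rangle + \frac{3}{4}\hbar^2\,|\uparrow_1\rangle|\downarrow_2\rangle - \frac{1}{2}\hbar^2\,|\uparrow_1\rangle|\downarrow_2\rangle + \hbar^2\,|\downarrow_1\rangle|\uparrow_2\rangle \\
&= \hbar^2\,|\uparrow_1\rangle|\downarrow_2\rangle + \hbar^2\,|\downarrow_1\rangle|\uparrow_2\rangle = \hbar^2\,|2\rangle + \hbar^2\,|3\rangle
\end{aligned}
\tag{9.51}
\]

where the ladder operators in the term $\hat{S}_-^{(1)}\hat{S}_+^{(2)}$ are responsible for the non-zero
contribution, in which the spin of each particle is flipped. And again

\[
\begin{aligned}
\left(\hat{\mathbf{S}}^{tp}\right)^2 |3\rangle &= \left[\left(\hat{\mathbf{S}}^{(1)}\right)^2 + \left(\hat{\mathbf{S}}^{(2)}\right)^2 + 2\hat{S}_z^{(1)}\hat{S}_z^{(2)} + \hat{S}_+^{(1)}\hat{S}_-^{(2)} + \hat{S}_-^{(1)}\hat{S}_+^{(2)}\right] |\downarrow_1\rangle|\uparrow_2\rangle \\
&= \frac{3}{4}\hbar^2\,|\downarrow_1\rangle|\uparrow_2\rangle + \frac{3}{4}\hbar^2\,|\downarrow_1\rangle|\uparrow_2\rangle - \frac{1}{2}\hbar^2\,|\downarrow_1\rangle|\uparrow_2\rangle + \hbar^2\,|\uparrow_1\rangle|\downarrow_2\rangle \\
&= \hbar^2\,|\downarrow_1\rangle|\uparrow_2\rangle + \hbar^2\,|\uparrow_1\rangle|\downarrow_2\rangle = \hbar^2\,|3\rangle + \hbar^2\,|2\rangle
\end{aligned}
\tag{9.52}
\]

where the inversion of the spins in the last term is due to operators $\hat{S}_+^{(1)}\hat{S}_-^{(2)}$. Finally

\[
\begin{aligned}
\left(\hat{\mathbf{S}}^{tp}\right)^2 |4\rangle &= \left[\left(\hat{\mathbf{S}}^{(1)}\right)^2 + \left(\hat{\mathbf{S}}^{(2)}\right)^2 + 2\hat{S}_z^{(1)}\hat{S}_z^{(2)} + \hat{S}_+^{(1)}\hat{S}_-^{(2)} + \hat{S}_-^{(1)}\hat{S}_+^{(2)}\right] |\downarrow_1\rangle|\downarrow_2\rangle \\
&= \frac{3}{4}\hbar^2\,|\downarrow_1\rangle|\downarrow_2\rangle + \frac{3}{4}\hbar^2\,|\downarrow_1\rangle|\downarrow_2\rangle + \frac{1}{2}\hbar^2\,|\downarrow_1\rangle|\downarrow_2\rangle = 2\hbar^2\,|\downarrow_1\rangle|\downarrow_2\rangle = 2\hbar^2\,|4\rangle .
\end{aligned}
\tag{9.53}
\]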


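Reading out columns 1 through 4 from Eqs. 9.50 to 9.53, respectively, I generate
the desired matrix:

\[
\left(\hat{\mathbf{S}}^{tp}\right)^2 = \hbar^2
\begin{pmatrix}
2 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 \\
0 & 1 & 1 & 0 \\
0 & 0 & 0 & 2
\end{pmatrix}
\]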
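The column-by-column construction above can be cross-checked by assembling the same operator from single-particle matrices. The sketch below is an illustration only, assuming $\hbar = 1$, the basis ordering of Eq. 9.43, and the ladder-operator form of $(\hat{\mathbf{S}}^{tp})^2$ used in Eqs. 9.50-9.53; the helper names are hypothetical.

```python
from fractions import Fraction


def kron(a, b):
    """Kronecker product: a acts on particle 1, b acts on particle 2."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]


def add(*mats):
    """Element-wise sum of any number of equally sized matrices."""
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(m[i][j] for m in mats) for j in range(cols)] for i in range(rows)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


half = Fraction(1, 2)
one = [[1, 0], [0, 1]]
sz = [[half, 0], [0, -half]]
s_plus = [[0, 1], [0, 0]]    # S_+ |down> = |up>   (hbar = 1)
s_minus = [[0, 0], [1, 0]]   # S_- |up>   = |down>
s_squared = scale(Fraction(3, 4), one)  # s(s + 1) = 3/4 for spin 1/2

# (S^tp)^2 = (S1)^2 + (S2)^2 + 2 Sz1 Sz2 + S+1 S-2 + S-1 S+2
s2_total = add(
    kron(s_squared, one),
    kron(one, s_squared),
    scale(2, kron(sz, sz)),
    kron(s_plus, s_minus),
    kron(s_minus, s_plus),
)

for row in s2_total:
    print([str(x) for x in row])
```

Each printed row is in units of $\hbar^2$ and can be compared with the matrix read out from Eqs. 9.50-9.53.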


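What is left now is to find its eigenvalues and eigenvectors, i.e., to solve the
eigenvalue problem:

\[
\hbar^2
\begin{pmatrix}
2 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 \\
0 & 1 & 1 & 0 \\
0 & 0 & 0 & 2
\end{pmatrix}
\begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{pmatrix}
= \lambda \hbar^2
\begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{pmatrix} .
\]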


Two eigenvalues can be found just by looking at Eqs. 9.50 and 9.53, which indicate
that vectors $|1\rangle$ and $|4\rangle$ are eigenvectors of this matrix with eigenvalues $2\hbar^2$. This
circumstance is reflected in the structure of the matrix, where the first and fourth
rows as well as the first and fourth columns contain single non-zero elements. Such
matrices are known as block-diagonal, and what makes them special is that each
block can be considered independently of the others and treated accordingly. For
instance, the equations for $a_1$ and $a_4$ will not contain any other coefficients, while
equations for elements $a_2$ and $a_3$ will only contain these two elements. Since I
already know that solutions with $a_1 = 1$, $a_{2,3,4} = 0$ and $a_4 = 1$, $a_{1,2,3} = 0$ are
eigenvectors corresponding to the eigenvalue $2\hbar^2$ (i.e., $\lambda = 2$), I only need to deal with the remaining two
coefficients $a_2$ and $a_3$ satisfying equations

\[
\begin{aligned}
a_2 + a_3 &= \lambda a_2 \\
a_2 + a_3 &= \lambda a_3 .
\end{aligned}
\]

Subtracting the second equation from the first gives $\lambda (a_2 - a_3) = 0$, so it immediately
follows from this system that either $a_2 = a_3$ or $\lambda = 0$. In the former
case, I have

\[
\lambda = 2,
\]

while the latter one gives me

\[
a_2 = -a_3 .
\]

Thus, I end up once again with eigenvalue $2\hbar^2$, but now it belongs to the eigenvector

\[
\frac{1}{\sqrt{2}}\left(|2\rangle + |3\rangle\right) = \frac{1}{\sqrt{2}}\left(|\uparrow_1\rangle|\downarrow_2\rangle + |\downarrow_1\rangle|\uparrow_2\rangle\right)
\]

where I set $a_2 = a_3 = 1/\sqrt{2}$ to make this vector normalized. I also got a new
eigenvalue equal to zero with eigenvector

\[
\frac{1}{\sqrt{2}}\left(|2\rangle - |3\rangle\right) = \frac{1}{\sqrt{2}}\left(|\uparrow_1\rangle|\downarrow_2\rangle - |\downarrow_1\rangle|\uparrow_2\rangle\right) .
\]

Recalling that eigenvalues of $(\hat{\mathbf{S}}^{tp})^2$ must have the form $\hbar^2 S(S+1)$, I can
immediately deduce that eigenvalue $2\hbar^2$ corresponds to $S = 1$, while eigenvalue
zero, obviously, corresponds to $S = 0$.
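The eigenvectors just found can be verified by direct multiplication. The following Python sketch is only an illustration: it works in units of $\hbar^2$, uses unnormalized integer vectors (the factor $1/\sqrt{2}$ does not affect the eigenvalue), and the function names `eigenvalue` and `total_spin` are hypothetical helpers, not notation from the text.

```python
from fractions import Fraction

# Matrix of (S^tp)^2 in units of hbar^2, basis ordering of Eq. 9.43.
S2_MATRIX = [[2, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 2]]


def apply(matrix, vec):
    """Matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def eigenvalue(matrix, vec):
    """Return lam if matrix * vec == lam * vec, otherwise None."""
    image = apply(matrix, vec)
    pivot = next(i for i, v in enumerate(vec) if v != 0)
    lam = Fraction(image[pivot], vec[pivot])
    return lam if image == [lam * v for v in vec] else None


def total_spin(lam):
    """Find S in {0, 1/2, 1, 3/2, 2} with S(S + 1) == lam, or None."""
    for twice_s in range(5):
        s = Fraction(twice_s, 2)
        if s * (s + 1) == lam:
            return s
    return None


candidates = {
    "|1>": [1, 0, 0, 0],
    "|2> + |3>": [0, 1, 1, 0],
    "|4>": [0, 0, 0, 1],
    "|2> - |3>": [0, 1, -1, 0],
}
for name, vec in candidates.items():
    lam = eigenvalue(S2_MATRIX, vec)
    print(name, "-> lambda =", lam, ", S =", total_spin(lam))
```

Note that a basis vector such as $|2\rangle$ alone is not an eigenvector: `eigenvalue(S2_MATRIX, [0, 1, 0, 0])` returns `None`, which is why the superpositions are needed.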

It is time to put all these results together. Here is what I have: a triply degenerate
eigenvalue characterized by spin $S = 1$ and three eigenvectors

\[
\begin{aligned}
|1,1\rangle &= |\uparrow_1\rangle|\uparrow_2\rangle \\
|1,0\rangle &= \frac{1}{\sqrt{2}}\left(|\uparrow_1\rangle|\downarrow_2\rangle + |\downarrow_1\rangle|\uparrow_2\rangle\right) \\
|1,-1\rangle &= |\downarrow_1\rangle|\downarrow_2\rangle
\end{aligned}
\tag{9.54}
\]

and a single non-degenerate eigenvalue corresponding to $S = 0$ with eigenvector

\[
|0,0\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow_1\rangle|\downarrow_2\rangle - |\downarrow_1\rangle|\uparrow_2\rangle\right)
\tag{9.55}
\]

attached to it. Notations used for these eigenvectors follow the traditional scheme
$|S, M\rangle$ and reflect the facts that all three eigenvectors in Eq. 9.54 are simultaneously
eigenvectors of operator $\hat{S}_z^{tp}$ with corresponding quantum numbers $M = 1$, $M = 0$,
and $M = -1$, while the single eigenvector in Eq. 9.55 is also an eigenvector of $\hat{S}_z^{tp}$
corresponding to $M = 0$. You might want to pay attention to the fact that both
superposition eigenvectors $|1,0\rangle$ and $|0,0\rangle$ are linear combinations of the
eigenvectors of $\hat{S}_z^{tp}$ established in Eqs. 9.47 and 9.48 belonging to the doubly degenerate
eigenvalue 0 of $\hat{S}_z^{tp}$, which reflects the general notion that linear combinations of
degenerate eigenvectors are also eigenvectors belonging to the same eigenvalue. The
particular combinations appearing in Eqs. 9.54 and 9.55 ensure that these vectors are
simultaneously eigenvectors of the operator $(\hat{\mathbf{S}}^{tp})^2$. The results presented in these
equations also reflect the general property of the angular momentum operators: the
value of quantum number $S$ determines the maximum and minimum allowed values
of the second quantum number $M$ and, respectively, the total number $2S + 1$ of
eigenvectors belonging to the given eigenvalue of $(\hat{\mathbf{S}}^{tp})^2$. Indeed, for $S = 1$, we
have three vectors with $M$ ranging from $-1$ to 1, while for $S = 0$, there exists a
single vector with $M = 0$. This situation is often described by saying that the system
of two spin-1/2 particles can be in two states characterized by the total spin equal
to one or zero. The former is called a triplet state reflecting the existence of three
distinct states with the same S and different magnetic numbers M, and the latter is 
called a singlet for obvious enough reasons. People also often say that in the triplet 
state, the spins of the particles are parallel to each other, while in the singlet state, 
they are antiparallel, but this is highly misleading. Even leaving aside the obvious 
quantum mechanical fact that the direction of spin in quantum mechanics is not 
defined because only one component of the vector can have a definite value in a 
given state, parallel or antiparallel can refer only to the sign of the z-component of 
the spin determined by the value of $M$. As we have just seen, this number can be
equal to zero, reflecting the “antiparallel” orientation of the particles’ spins, when
the particles are in either the $S = 1$ or the $S = 0$ state. Therefore, a more accurate
verbal description of the situation (if you really need one) may sound like this: in
the triplet spin states, the particles’ spins can be either parallel or antiparallel, while
in the singlet state, they can only be antiparallel.

To complete this discussion, let me direct your attention to another interesting 
difference between triplet and singlet states. The former are symmetric with respect 
to the exchange of the particles, while the latter are antisymmetric. What it means 
is that if you replace particle indexes 1 and 2 in Eqs. 9.54 and 9.55 (exchange 
the particles one and two), the states described by the former equation do not 
change, while the singlet state described by the latter equation changes its sign. 
The operation of the particle’s exchange reflects the classical idea that you can 
somehow distinguish between the particles marking them as one and two and then 
swap them by placing particle one in the state of particle two and vice versa. In 
quantum mechanics two electrons are not really distinguishable, and, therefore, the 
swapping operation shouldn’t change the properties of the system. This topic will be 
discussed in much more detail in Chap. 11 devoted to quantum mechanics of many
identical particles. Here I just want to mention, giving you a brief preview of what
is coming, that the symmetry and antisymmetry of the spin states of the two-particle 
system are a reflection of quantum indistinguishability of electrons. 
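The exchange symmetry of the triplet and singlet states can be made concrete with a few lines of code. This is a minimal sketch under stated assumptions: amplitudes are stored in the basis ordering of Eq. 9.43, the vectors are left unnormalized (normalization does not affect the symmetry), and `swap` is a hypothetical helper name for the particle-exchange operation.

```python
def swap(vec):
    """Exchange particle labels: the amplitude of |a>|b> moves to |b>|a>.

    Basis index = 2 * (spin of particle 1) + (spin of particle 2),
    with 0 standing for spin up and 1 for spin down.
    """
    out = [0] * 4
    for index, amplitude in enumerate(vec):
        first, second = divmod(index, 2)
        out[2 * second + first] = amplitude
    return out


triplet = {
    "|1,1>": [1, 0, 0, 0],
    "|1,0>": [0, 1, 1, 0],    # unnormalized |2> + |3>
    "|1,-1>": [0, 0, 0, 1],
}
singlet = [0, 1, -1, 0]       # unnormalized |2> - |3>

for name, vec in triplet.items():
    print(name, "symmetric:", swap(vec) == vec)
print("|0,0> antisymmetric:", swap(singlet) == [-x for x in singlet])
```

Swapping only interchanges the amplitudes of $|2\rangle$ and $|3\rangle$, which is why the sign between them decides the symmetry.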


9.5 Operator of Total Angular Momentum 

9.5.1 Combining Orbital and Spin Degrees of Freedom 


When discussing the model of spin 1/2 or the addition of two such spins, I intentionally
ignored the fact that the spin is “attached” to a particle, which can be involved
in all kinds of crazy things such as being a part of an atom or rushing through a
piece of metal delivering an electron current. At the same time, such phenomena as
resonant tunneling or the hydrogen atom in the previous chapters were treated with utter
ignorance of the fact that in addition to “regular” observables, such as position or
momentum, electrons also carry around their spin, which is as unalienable as their
mass or charge. Now the time has come to design a formalism that allows one to treat spin
and orbital properties² of the electrons (and other particles with spin) together.

First of all, one needs to recognize that the spinors and orbital vectors are 
completely different animals and inhabit different habitats. For instance, while you 
can represent eigenvectors of momentum and angular momentum in the same, say, 
position representation or express them in terms of each other, it is impossible to 
construct a position representation for the eigenvectors of the spin operators or 
present momentum eigenvectors as a linear combination of spinors. Accordingly, 
operators acting on orbital vectors do not affect spinors, and spin operators are 
indifferent to vectors representing orbital states. One of the trivial consequences 
of this is, of course, that orbital and spin operators always commute. Giving these 
statements a bit of a thought, you can notice a certain similarity with the just 
discussed two-spin problem, where we also had to deal with vectors belonging to 
two unrelated spaces and being acted upon only by their “native” operators. That 
situation was handled by combining spinors representing spin states of different 
particles into a common space formed as a tensor product of the spaces of each 
individual spin. Similarly, spin and orbital spaces of a single particle can also be 
combined into a tensor product space by stacking together all combinations of 
the basis vectors from both spaces. Assuming that the orbital space is described
by some discrete basis $|q_k^{(1)}, q_m^{(2)}, \ldots, q_p^{(N_{max})}\rangle$ based on a set of mutually consistent
observables, a typical basis vector in a compound tensor product space can be made
to look something like this:

\[
|q_k^{(1)}, q_m^{(2)}, \ldots, q_p^{(N_{max})}\rangle\, |m_s\rangle
\tag{9.56}
\]

where $|m_s\rangle$ is a basis spinor. Since there are only two of those, the dimension of
the combined space is two times the dimensionality of the orbital space. Indeed,
attaching the spin state to each orbital basis vector $|q_k^{(1)}, q_m^{(2)}, \ldots, q_p^{(N_{max})}\rangle$, I am
generating two new basis vectors:

\[
|q_k^{(1)}, q_m^{(2)}, \ldots, q_p^{(N_{max})}\rangle\, |1/2\rangle
\quad \text{and} \quad
|q_k^{(1)}, q_m^{(2)}, \ldots, q_p^{(N_{max})}\rangle\, |-1/2\rangle ,
\]

or, if you prefer,

\[
|q_k^{(1)}, q_m^{(2)}, \ldots, q_p^{(N_{max})}\rangle\, |\uparrow\rangle
\quad \text{and} \quad
|q_k^{(1)}, q_m^{(2)}, \ldots, q_p^{(N_{max})}\rangle\, |\downarrow\rangle .
\]

2 By orbital properties I understand all those properties of the particle that can be described using
quantum states related to position or momentum operators or a combination thereof. In what
follows I will call these states and vectors representing them orbital states or orbital vectors.

Sometimes the indicator of a spin state is put inside a single ket or bra vector together
with the signifiers of all other observables:

\[
|q_k^{(1)}, q_m^{(2)}, \ldots, q_p^{(N_{max})}\rangle\, |m_s\rangle \equiv |q_k^{(1)}, q_m^{(2)}, \ldots, q_p^{(N_{max})}, m_s\rangle
\tag{9.57}
\]


but this notation hides the critical difference between the spin and orbital observables
and makes some calculations less intuitive, so I would prefer using the notation
of Eq. 9.56 most of the time. Nevertheless, sometimes it might be appropriate to 
use the simplified notation of Eq. 9.57, and if you notice me doing it, do not 
start throwing stones—this is just a notation, chosen based on convenience and a 
moment’s expedience. 

An arbitrary vector $|\chi\rangle$ residing in the tensor product space can be presented as

\[
|\chi\rangle = \sum_{k,m,\ldots,p} a_{km\ldots p;\uparrow}\, |q_k^{(1)}, q_m^{(2)}, \ldots, q_p^{(N_{max})}\rangle\, |\uparrow\rangle
+ \sum_{k,m,\ldots,p} a_{km\ldots p;\downarrow}\, |q_k^{(1)}, q_m^{(2)}, \ldots, q_p^{(N_{max})}\rangle\, |\downarrow\rangle .
\tag{9.58}
\]

Expansion coefficients $a_{km\ldots p;\uparrow}$ now define the probability $|a_{km\ldots p;\uparrow}|^2$ that the
measurement of the mutually consistent observables including a component of
the spin will yield values $k, m, \ldots, p$ for regular observables and $\hbar/2$ for the spin’s
component. The set of coefficients $a_{km\ldots p;\downarrow}$ defines the probability $|a_{km\ldots p;\downarrow}|^2$ that
the observation will produce the same values of all the orbital observables and value
$-\hbar/2$ for the spin. The sum of probabilities

\[
P_{km\ldots p} = |a_{km\ldots p;\uparrow}|^2 + |a_{km\ldots p;\downarrow}|^2
\]

yields the probability to observe the given values of the observables provided that
the spin is not measured, while the sums

\[
P_{\uparrow} = \sum_{k,m,\ldots,p} |a_{km\ldots p;\uparrow}|^2
\]

or

\[
P_{\downarrow} = \sum_{k,m,\ldots,p} |a_{km\ldots p;\downarrow}|^2
\]

generate probabilities of getting values of the spin component $\hbar/2$ or $-\hbar/2$,
respectively, regardless of the values of other observables. Finally, the normalization
condition for the expansion coefficients must now include the summation over all
available variables:

\[
\sum_{k,m,\ldots,p} \left( |a_{km\ldots p;\uparrow}|^2 + |a_{km\ldots p;\downarrow}|^2 \right) = 1 .
\tag{9.59}
\]
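The bookkeeping of these probabilities is easy to get wrong, so here is a small illustrative sketch. Everything in it is an assumption made for the sake of the example: a single orbital label with two hypothetical values `"k1"` and `"k2"` stands in for the whole set $k, m, \ldots, p$, and the four amplitudes are invented numbers chosen so that Eq. 9.59 holds.

```python
# Hypothetical expansion coefficients a_{k; spin}; the numbers are made up.
amplitudes = {
    ("k1", "up"): 0.5,
    ("k1", "down"): 0.5j,
    ("k2", "up"): 0.1,
    ("k2", "down"): 0.7,
}


def probability(select):
    """Sum |a|^2 over all basis vectors whose (orbital, spin) key is selected."""
    return sum(abs(a) ** 2 for key, a in amplitudes.items() if select(key))


p_up = probability(lambda key: key[1] == "up")      # spin hbar/2, any orbital state
p_down = probability(lambda key: key[1] == "down")  # spin -hbar/2, any orbital state
p_k1 = probability(lambda key: key[0] == "k1")      # orbital value k1, spin not measured
total = probability(lambda key: True)               # left-hand side of Eq. 9.59

print("P_up   =", round(p_up, 12))
print("P_down =", round(p_down, 12))
print("P_k1   =", round(p_k1, 12))
print("total  =", round(total, 12))
```

Summing over the spin index gives the orbital probabilities, summing over the orbital index gives the spin probabilities, and the two kinds of marginal sums share the same total.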

Equations 9.56-9.58 are written under the assumption that the basis in the
orbital space is discrete. However, they can be easily adapted to representations
in a continuous basis by replacing all the sums with integrals and probabilities
with corresponding probability densities. For instance, in the basis of the position
eigenvectors $|\mathbf{r}\rangle$, Eqs. 9.56 and 9.58 become $|\mathbf{r}\rangle |m_s\rangle$ and

\[
|\chi\rangle = \int d^3r\, \psi_{\uparrow}(\mathbf{r})\, |\mathbf{r}\rangle |\uparrow\rangle + \int d^3r\, \psi_{\downarrow}(\mathbf{r})\, |\mathbf{r}\rangle |\downarrow\rangle .
\tag{9.60}
\]

$|\psi_{m_s}(\mathbf{r})|^2$ now gives the position probability density for the corresponding spin state
$|m_s\rangle$, $|\psi_{\uparrow}(\mathbf{r})|^2 + |\psi_{\downarrow}(\mathbf{r})|^2$ yields the same, but when the spin state is not important,
and $\int d^3r\, |\psi_{m_s}(\mathbf{r})|^2$ generates the probability of finding the particle in the spin state
$|m_s\rangle$. The normalization Eq. 9.59 now becomes

\[
\int d^3r \left[ |\psi_{\uparrow}(\mathbf{r})|^2 + |\psi_{\downarrow}(\mathbf{r})|^2 \right] = 1 .
\tag{9.61}
\]


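One can generate particular representations for the generic vectors $|\chi\rangle$ by
choosing specific bases for the orbital and spinor components of the states. One
of the most popular choices is to use the position representation for the orbital
vectors and eigenvectors of operator $\hat{S}_z$ for the spinor component. The respective
representation is generated by premultiplying $|q_k^{(1)}, q_m^{(2)}, \ldots, q_p^{(N_{max})}\rangle$ by $\langle \mathbf{r}|$, which
yields

\[
\psi_{q_k^{(1)}, q_m^{(2)}, \ldots, q_p^{(N_{max})}}(\mathbf{r}) = \langle \mathbf{r} \,|\, q_k^{(1)}, q_m^{(2)}, \ldots, q_p^{(N_{max})} \rangle ,
\]

and by replacing $|m_s\rangle$ with a corresponding two-component column:
$\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ for the spin-up (or $+1/2$) state and
$\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ for the spin-down (or $-1/2$) state. Since the
coordinate representation for the orbital states is almost always used in conjunction
with the representation of spinors in the basis of the eigenvectors of the $\hat{S}_z$ operator, I
will call this form the coordinate-spinor representation. Then the combined
spin-orbital state takes the form

\[
\psi_{q_k^{(1)}, q_m^{(2)}, \ldots, q_p^{(N_{max})}}(\mathbf{r}) \begin{pmatrix} 1 \\ 0 \end{pmatrix}
\quad \text{or} \quad
\psi_{q_k^{(1)}, q_m^{(2)}, \ldots, q_p^{(N_{max})}}(\mathbf{r}) \begin{pmatrix} 0 \\ 1 \end{pmatrix}
\]

depending on the chosen spin state. The generic state vector represented by Eq. 9.58
in this representation becomes (I will keep the same notation for the abstract vector
and its coordinate-spinor representation to avoid introducing new symbols, when it
is not really necessary and should not cause any confusion)

\[
\begin{aligned}
|\chi\rangle &= \sum_{k,m,\ldots,p} a_{km\ldots p;\uparrow}\, \psi_{q_k^{(1)}, q_m^{(2)}, \ldots, q_p^{(N_{max})}}(\mathbf{r}) \begin{pmatrix} 1 \\ 0 \end{pmatrix}
+ \sum_{k,m,\ldots,p} a_{km\ldots p;\downarrow}\, \psi_{q_k^{(1)}, q_m^{(2)}, \ldots, q_p^{(N_{max})}}(\mathbf{r}) \begin{pmatrix} 0 \\ 1 \end{pmatrix} \\
&= \Psi_{\uparrow}(\mathbf{r},t) \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \Psi_{\downarrow}(\mathbf{r},t) \begin{pmatrix} 0 \\ 1 \end{pmatrix}
\end{aligned}
\tag{9.62}
\]

where

\[
\begin{aligned}
\Psi_{\uparrow}(\mathbf{r},t) &= \sum_{k,m,\ldots,p} a_{km\ldots p;\uparrow}\, \psi_{q_k^{(1)}, q_m^{(2)}, \ldots, q_p^{(N_{max})}}(\mathbf{r}) \\
\Psi_{\downarrow}(\mathbf{r},t) &= \sum_{k,m,\ldots,p} a_{km\ldots p;\downarrow}\, \psi_{q_k^{(1)}, q_m^{(2)}, \ldots, q_p^{(N_{max})}}(\mathbf{r})
\end{aligned}
\tag{9.63}
\]

are the orbital wave functions corresponding to spin-up and spin-down states,
respectively. These functions appear in Eq. 9.63 as linear combinations of
the initial basis vectors transformed to their position representations. Obviously,
$\Psi_{\uparrow}(\mathbf{r},t)$ and $\Psi_{\downarrow}(\mathbf{r},t)$ in these expressions are the same functions that appear
in Eq. 9.60 presenting the expansion of an abstract vector $|\chi\rangle$ in the basis of position
eigenvectors.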

Any combination of orbital and spin operators acts on vectors defined by Eq. 9.58
or 9.62 following a simple rule: orbital operators act on the orbital component of the
vector, and spin operators affect only its spin component. To illustrate this point,
consider the following example. 

Example 26 (Using Operators of Orbital and Spin Angular Momentum.) Consider 
the following vector representing a state of an electron in a hydrogen atom: 

l«> = \ 12 ,1,-1) It) + \ 1 1 , 0 , 0 ) |t) - \ | 2 , 0 , 0 ) |t) + -E | 2 , 1, 1) |t), 

where the orbital portion of the state follows the standard notation | n, l, m). Compute 
the following expressions: 

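1. $\langle \alpha | \hat{H} | \alpha \rangle$, where $\hat{H}$ is the Hamiltonian of a hydrogen atom, Eq. 8.6.

2. $\left(\hat{L}_+ \hat{S}_- + \hat{L}_- \hat{S}_+\right) |\alpha\rangle$.

3. $\left(\hat{L}_z + \hat{S}_z\right) |\alpha\rangle$.

4. Write down vector $|\alpha\rangle$ in the coordinate-spinor representation.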

Solution 


1 . I begin by computing 

Hl«> = - jf P. l.-D It) - j£l U.0.0) It) + ~ 12,0,0) If) 

At' 2 ' 


where —E\ is the hydrogen ground state energy. Now I find 


(a\H\a) 


E\ E\ E\ E\ E\ 
~9~~9~26~V2~ _ T’ 


where I took into account that all terms in the expression above remain mutually
orthogonal, so that all cross-product terms in the inner product vanish. The spin
components of the state are not affected by the Hamiltonian because it does not
contain any spin operators.
2.


2 1 

(L+S- +L-S+J | a) = -V2h 2 |2,1,0) It) + -^V2/* 2 l 2 > 1.0) It) = 



it> + m), 


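where I applied orbital and spin ladder operators separately to the corresponding
orbital and spin portions of the vectors using, respectively,
Eqs. 3.75, 3.76, 5.104, and 5.106. In particular I found that

\[
\hat{L}_+ \hat{S}_- \,|2,1,0\rangle |\downarrow\rangle = \hat{L}_+ |2,1,0\rangle\, \hat{S}_- |\downarrow\rangle = 0
\]

as well as that

\[
\hat{L}_- \hat{S}_+ \,|2,1,0\rangle |\uparrow\rangle = \hat{L}_- |2,1,0\rangle\, \hat{S}_+ |\uparrow\rangle = 0 .
\]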


3. 


(L z + S z ) |a) = -h 2 - |2,1,-1) It) + \\ |2,1,-1) It) 


" 11.0,0) |t) - |2,0,0> |t) + |2,1,1) |t) - 2=! |2,1,1) |t) = 


“ |2,1,-1) |t) - \ |1.0,0) |t) - 1 12,0 , 0) |t) + -h |2,1 , 1) |t) 


4. A coordinate-spinor representation of vector \a) looks like this: 


f/? 2 i (r)Ff 1 (0,<p)-jj^R 2i} (r) 


If $\Psi_{\uparrow}(\mathbf{r},t)$ and $\Psi_{\downarrow}(\mathbf{r},t)$ can be written down as

\[
\Psi_{\uparrow}(\mathbf{r},t) = a_1(t)\,\psi(\mathbf{r},t); \qquad \Psi_{\downarrow}(\mathbf{r},t) = a_2(t)\,\psi(\mathbf{r},t),
\tag{9.64}
\]

Eq. 9.62 becomes

\[
|\chi\rangle = \psi(\mathbf{r},t) \begin{pmatrix} a_1(t) \\ a_2(t) \end{pmatrix}
\tag{9.65}
\]

resulting in the separation of spin and orbital components of the state. The spin and
orbital properties of the particle in such a state are completely independent of each
other, and changing one of them wouldn’t affect the other. In a more generic case,
when $\Psi_{\uparrow}(\mathbf{r})$ and $\Psi_{\downarrow}(\mathbf{r})$ are two different functions, the orbital state of the particle
depends on its spin state and vice versa. This interdependence is called “spin-orbit 
coupling” and is responsible for many important phenomena. Some of them are old, 
known for a century, while others have been discovered only recently. For instance, 
spin-orbit interaction is responsible for the fine structure of atomic spectra (an old 
phenomenon known from the earlier days of quantum mechanics), but it also gave 
birth to the entire new “hot” research area in contemporary semiconductor physics 
known as spintronics. Researchers working in this field seek to control the flow of 
electrons using their spin as a steering wheel and also to control the orientation of 
an electron’s spin by affecting its electric current. I will talk more about spin-orbit 
coupling and its effect on atomic spectra in Chap. 14, but for the spintronics effects, 
you will have to consult a more specialized book. 

While the abstract form of the Schrödinger equation

\[
i\hbar \frac{\partial |\chi\rangle}{\partial t} = \hat{H} |\chi\rangle
\]

stays the same even when the spin and orbital degrees of freedom are combined,
its position representation, which is frequently used for practical calculations,
needs to be modified. Indeed, in the representation described by Eq. 9.62, a state
of a particle is described by two wave functions corresponding to two different
spin states. Respectively, a single Schrödinger equation becomes a system of two
equations, whose form depends on the interactions included in the Hamiltonian.
To find the explicit form of these equations, you will need to convert operator $\hat{H}$
into the combined position-spinor representation. This can be done independently
for the orbital and spin portions of the Hamiltonian with the result, which can be
presented in the form

\[
\hat{H} \rightarrow H_{m_s, m_s'}(\mathbf{r}) = \langle m_s | \hat{H}(\mathbf{r}) | m_s' \rangle
\]

where $m_s$, $m_s'$ take values 1 or 2 corresponding, respectively, to $m_s = 1/2$ and $m_s =
-1/2$. Thus, the Hamiltonian in the presence of the spin becomes a $2 \times 2$ matrix,
and its action on the state presented in the form of Eq. 9.62 involves (in addition to
what it normally does to orbital vectors) the multiplication of a matrix and a spinor.
In the most trivial case, when the Hamiltonian does not contain any spin operators
and does not act, therefore, on spin states, this matrix becomes

\[
H_{m_s, m_s'}(\mathbf{r}) = \langle m_s | \hat{H}(\mathbf{r}) | m_s' \rangle = \hat{H}(\mathbf{r}) \langle m_s | m_s' \rangle = \hat{H}(\mathbf{r})\, \delta_{m_s, m_s'}
\]

so that the Schrödinger equations for both wave function components $\Psi_{\uparrow}(\mathbf{r})$ and
$\Psi_{\downarrow}(\mathbf{r})$ are identical. In this case the total state of the system is described by the
vector of the form given by Eq. 9.65, in which the coefficients $a_1$ and $a_2$ of the spinor
component can be chosen arbitrarily. Physically this means that in the absence of the
spin-related terms in the Hamiltonian, the spin state of the particle does not change
with time and is determined by the initial conditions.

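Now let me consider a less trivial case, when the Hamiltonian includes a
stand-alone spin operator, something like what we dealt with in Sect. 9.3:

\[
\hat{H} = \hat{H}_{orb} + \frac{2\mu_B B}{\hbar} \hat{S}_z .
\tag{9.66}
\]

Here $\hat{H}_{orb}$ is a spin-independent portion of the Hamiltonian, and the second term,
as you know, describes the interaction of the spin with a uniform magnetic field $B$
directed along the Z-axis. In the matrix form, this Hamiltonian becomes

\[
H_{m_s, m_s'} = \hat{H}_{orb}\, \delta_{m_s, m_s'} + \mu_B B\, (\sigma_z)_{m_s, m_s'}
\tag{9.67}
\]

where I used the representation of the spin operators in terms of the corresponding
Pauli matrices introduced in Eqs. 9.15-9.17. The explicit matrix form of the
stationary Schrödinger equation becomes

\[
\begin{pmatrix} \hat{H}_{orb} & 0 \\ 0 & \hat{H}_{orb} \end{pmatrix}
\begin{pmatrix} \Psi_{\uparrow}(\mathbf{r}) \\ \Psi_{\downarrow}(\mathbf{r}) \end{pmatrix}
+ \mu_B B \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\begin{pmatrix} \Psi_{\uparrow}(\mathbf{r}) \\ \Psi_{\downarrow}(\mathbf{r}) \end{pmatrix}
= E \begin{pmatrix} \Psi_{\uparrow}(\mathbf{r}) \\ \Psi_{\downarrow}(\mathbf{r}) \end{pmatrix}
\]

and translates into two independent equations:

\[
\hat{H}_{orb} \Psi_{\uparrow}(\mathbf{r}) + \mu_B B\, \Psi_{\uparrow}(\mathbf{r}) = E\, \Psi_{\uparrow}(\mathbf{r})
\tag{9.68}
\]
\[
\hat{H}_{orb} \Psi_{\downarrow}(\mathbf{r}) - \mu_B B\, \Psi_{\downarrow}(\mathbf{r}) = E\, \Psi_{\downarrow}(\mathbf{r}) .
\tag{9.69}
\]

This independence signifies the absence of any spin-orbit coupling in this system:
the functions $\Psi_{\uparrow}(\mathbf{r})$ and $\Psi_{\downarrow}(\mathbf{r})$ can be chosen in the form of Eq. 9.64 where $\psi(\mathbf{r})$ is a
solution of the orbital equation $\hat{H}_{orb} \psi(\mathbf{r}) = E_{orb} \psi(\mathbf{r})$. With this, Eqs. 9.68 and 9.69
can be converted into equations

\[
\begin{aligned}
a_1 \left( E - E_{orb} - \mu_B B \right) &= 0 \\
a_2 \left( E - E_{orb} + \mu_B B \right) &= 0,
\end{aligned}
\]

yielding two eigenvalues $E^{(1)} = E_{orb} + \mu_B B$ and $E^{(2)} = E_{orb} - \mu_B B$, with two
respective eigenvectors $a_1^{(1)} = 1$, $a_2^{(1)} = 0$ and $a_1^{(2)} = 0$, $a_2^{(2)} = 1$. Choosing the
zero level of energy at $E_{orb}$ and disregarding the orbital part of the resulting spinors

\[
|\chi^{(1)}\rangle = \psi(\mathbf{r}) \begin{pmatrix} 1 \\ 0 \end{pmatrix}; \qquad
|\chi^{(2)}\rangle = \psi(\mathbf{r}) \begin{pmatrix} 0 \\ 1 \end{pmatrix},
\]

which does not affect any of the phenomena associated with the action of the
magnetic field on the electron’s spin, you end up with eigenvalues

\[
E^{(1,2)} = \pm \mu_B B
\]

and eigenvectors

\[
|\chi^{(1)}\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}; \qquad
|\chi^{(2)}\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix},
\]

identical to those found for a single spin in the magnetic field in Sect. 9.3.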
This example demonstrates that the “pure” spin approach, which ignores orbital 
components of the total state of a particle, is justified as long as the presence of the 
spin does not change its orbital state, i.e., in the absence of the spin-orbit interaction. 
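The structure of Eq. 9.67 can be illustrated with a toy model. The sketch below is not a calculation for any real system: it assumes a hypothetical two-level orbital Hamiltonian that is already diagonal, with invented energies $-4$ and $-1$ in arbitrary units, and an invented value $\mu_B B = 1/10$ in the same units. The tensor-product ordering is orbital index first, spin index second.

```python
from fractions import Fraction


def kron(a, b):
    """Kronecker product: a acts on the orbital space, b on the spin space."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


mu_b_times_b = Fraction(1, 10)   # hypothetical value of mu_B * B
h_orb = [[-4, 0], [0, -1]]       # toy orbital Hamiltonian (invented energies)
identity2 = [[1, 0], [0, 1]]
sigma_z = [[1, 0], [0, -1]]

# H = H_orb (x) 1_spin + mu_B B * 1_orb (x) sigma_z, cf. Eq. 9.67
hamiltonian = add(kron(h_orb, identity2),
                  scale(mu_b_times_b, kron(identity2, sigma_z)))

levels = [hamiltonian[i][i] for i in range(4)]
off_diagonal = [hamiltonian[i][j] for i in range(4) for j in range(4) if i != j]

print("energy levels:", [str(e) for e in levels])
print("no coupling between spin-up and spin-down:", all(x == 0 for x in off_diagonal))
```

Because the matrix stays diagonal, each orbital level simply splits into $E_{orb} \pm \mu_B B$, which mirrors the two independent equations 9.68 and 9.69.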


9.5.2 Total Angular Momentum: Eigenvalues and Eigenvectors 

In Example 26 in the preceding section, you learned that working in the tensor 
product of spin and orbital spaces, you can operate with expressions combining 
orbital and spin operators such as $\hat{L}_z + \hat{S}_z$. This is the z-component of a vector operator

\[
\hat{\mathbf{J}} = \hat{\mathbf{L}} + \hat{\mathbf{S}}
\tag{9.70}
\]

called the operator of a total angular momentum, which plays an important role in 
the general structure of quantum mechanics as well as in a variety of its applications. 
For instance, this operator is crucial for understanding the energy levels of hydrogen 
atom in the presence of spin-orbit coupling and magnetic field; I will introduce you 
to these topics in Chap. 14. Here my objective is to elucidate the general properties 
of this operator, which appears to be a logical conclusion to the discussion started 
in the previous section. 

I begin by stating that components of vector $\hat{\mathbf{J}}$ obey the same commutation
relations as those of its constituent vectors $\hat{\mathbf{L}}$ and $\hat{\mathbf{S}}$. This statement is easy to verify,
taking into account that orbital and spin operators commute. For instance, you can
check that

\[
\hat{J}_x \hat{J}_y - \hat{J}_y \hat{J}_x = \hat{L}_x \hat{L}_y - \hat{L}_y \hat{L}_x + \hat{S}_x \hat{S}_y - \hat{S}_y \hat{S}_x = i\hbar \hat{L}_z + i\hbar \hat{S}_z = i\hbar \hat{J}_z
\tag{9.71}
\]

where I canceled terms like $\hat{L}_x \hat{S}_y - \hat{S}_y \hat{L}_x = 0$. Once the commutation relations for
the components of $\hat{\mathbf{J}}$ are established, one can immediately claim that all components
of $\hat{\mathbf{J}}$ commute with operator $\hat{J}^2$, which can be written down as

\[
\hat{J}^2 = \hat{\mathbf{L}}^2 + \hat{\mathbf{S}}^2 + 2\hat{\mathbf{L}} \cdot \hat{\mathbf{S}} .
\tag{9.72}
\]

Indeed, the proof of the similar statement for orbital angular momentum carried
out in Sect. 3.3.2 was based exclusively on the inter-component commutation
relations and is, therefore, automatically extended to all operators with the same
commutation relations. If you go back to Sect. 3.3.4, you will recall that the
derivation of the eigenvalues of the orbital angular momentum operators carried
out there also relied exclusively on the commutation relations. Therefore, you can
immediately claim, without fear of retribution or embarrassment, that operators $\hat{J}^2$
and $\hat{J}_z$ possess a common system of eigenvectors, characterized by two numbers $j$
and $m_j$, satisfying the inequality $-j \le m_j \le j$, taking either integer or half-integer
values, and which generate eigenvalues of these operators according to

\[
\hat{J}^2\, |j, m_j\rangle = \hbar^2 j(j+1)\, |j, m_j\rangle
\tag{9.73}
\]
\[
\hat{J}_z\, |j, m_j\rangle = \hbar m_j\, |j, m_j\rangle .
\tag{9.74}
\]

However, it would be wrong for you to think that Eqs. 9.73 and 9.74 are the
end of the story. While these equations do give you some information about the
eigenvalues and eigenvectors of $\hat{J}^2$ and $\hat{J}_z$, this information is quite limited and does
not allow you, for instance, to generate representations of these vectors in any basis
except their own or to evaluate the results of the application of various
combinations of orbital and spin angular momentum operators to these states. To
be able to do all this, you need to answer more rather tough questions such as (a)
what is the relation between numbers $j$, $m_j$ on the one hand and numbers $l$, $s$, $m$, and
$m_s$ on the other, and (b) how are vectors $|j, m_j\rangle$ connected with vectors $|l, m\rangle$ and
$|m_s\rangle$? Finding answers to these questions requires substantial additional effort, so
that Eqs. 9.73 and 9.74 are not the end but just the beginning of the journey.

And as a first step, I would note an additional property of the operators $\hat{J}^2$ and $\hat{J}_z$,
which they possess by virtue of being the sum of orbital and spin operators:
they both commute with operators $\hat{\mathbf{L}}^2$ and $\hat{\mathbf{S}}^2$. Proof of this statement is quite
straightforward and is based on Eq. 9.72 as well as on the fact that both $\hat{\mathbf{L}}^2$ and
$\hat{\mathbf{S}}^2$ commute with all their components (well, $\hat{\mathbf{S}}^2$ for spin 1/2 is proportional to a
unity matrix and, therefore, commutes with everything). This means that operators
$\hat{J}^2$, $\hat{J}_z$, $\hat{\mathbf{L}}^2$, and $\hat{\mathbf{S}}^2$ have a common set of eigenvectors so that numbers $j$ and $m_j$ do not
provide a full description of these vectors. To have these vectors fully characterized,
one needs to throw number $l$ into the mix replacing $|j, m_j\rangle$ with $|j, l, m_j\rangle$ and adding
equation

\[
\hat{\mathbf{L}}^2\, |j, l, m_j\rangle = \hbar^2 l(l+1)\, |j, l, m_j\rangle
\tag{9.75}
\]

to Eqs. 9.73 and 9.74. Strictly speaking, I would need to include here a spin number
$s$ as well, but since I am going to limit this discussion to only spin 1/2 particles,
this number never changes so that its inclusion would just superfluously increase
the clumsiness of the notations.

A relation between vectors $|j, l, m_j\rangle$ and individual eigenvectors of the orbital
and spin operators can be established by using the latter as a basis in the combined
spin-orbital space defined in Sect. 9.5.1 as a tensor product of the orbital and spinor
spaces. Specializing the generic Eq. 9.58 to the particular case, when the basis in the
orbital space is presented by vectors $|l, m\rangle$, I can write for an arbitrary member $|\chi\rangle$
of the tensor product space:

\[
|\chi\rangle = \sum_{l', m, m_s} C^{l'}_{m, m_s}\, |l', m\rangle |m_s\rangle .
\tag{9.76}
\]

However, when applying this expansion to the particular case of vectors $|j, l, m_j\rangle$,
I need to take into account that these vectors are eigenvectors of $\hat{\mathbf{L}}^2$, i.e., that they
must obey Eq. 9.75:

\[
\begin{aligned}
\hat{\mathbf{L}}^2 \sum_{l', m, m_s} C^{l'}_{m, m_s}\, |l', m\rangle |m_s\rangle
&= \sum_{l', m, m_s} C^{l'}_{m, m_s}\, \hat{\mathbf{L}}^2\, |l', m\rangle |m_s\rangle \\
&= \hbar^2 \sum_{l', m, m_s} C^{l'}_{m, m_s}\, l'(l'+1)\, |l', m\rangle |m_s\rangle
= \hbar^2 l(l+1) \sum_{l', m, m_s} C^{l'}_{m, m_s}\, |l', m\rangle |m_s\rangle .
\end{aligned}
\]

Because of the orthogonality of the basis vectors $|l, m\rangle |m_s\rangle$, the only way to satisfy
the equality in the last line is to make sure that $l' = l$ is the only term in the sum. This
is achieved by setting $C^{l'}_{m, m_s} = C^{l}_{m, m_s} \delta_{l, l'}$ and thereby vanquishing the summation
over $l'$. In a less formal way, you can argue that for the vector defined by Eq. 9.76 to
be an eigenvector of $\hat{\mathbf{L}}^2$, it cannot be a combination of vectors with different values
of $l$. Thus, I can conclude that a representation of $|j, l, m_j\rangle$ in the basis of $|l, m\rangle |m_s\rangle$
must have the following form:

\[
|j, l, m_j\rangle = \sum_{m, m_s} C^{l,j}_{m, m_s, m_j}\, |l, m\rangle |m_s\rangle
\tag{9.77}
\]

where I also added upper index $j$ and lower index $m_j$ to the expansion coefficients
to make it explicit that the expansion is for eigenvectors of operators $\hat{J}^2$ and $\hat{J}_z$
characterized by quantum numbers $j$ and $m_j$.

The task now is to find coefficients $C^{l,j}_{m,m_s,m_j}$, which are a particular case of
so-called Clebsch-Gordan coefficients.$^3$ To this end I will first apply operator $\hat{J}_z$ to the
left-hand side of Eq. 9.77 and operator $\hat{L}_z + \hat{S}_z$ to its right-hand side. Using Eq. 9.74
on the left-hand side and similar properties of orbital and spin angular momentum
operators on the right-hand side, I obtain

$$
\begin{aligned}
&\hbar m_j\, |j, l, m_j\rangle = \hbar \sum_{m,m_s} C^{l,j}_{m,m_s,m_j} (m + m_s)\, |l, m\rangle |m_s\rangle \;\Rightarrow \\
&m_j \sum_{m,m_s} C^{l,j}_{m,m_s,m_j}\, |l, m\rangle |m_s\rangle = \sum_{m,m_s} C^{l,j}_{m,m_s,m_j} (m + m_s)\, |l, m\rangle |m_s\rangle \;\Rightarrow \\
&\sum_{m,m_s} C^{l,j}_{m,m_s,m_j} (m_j - m - m_s)\, |l, m\rangle |m_s\rangle = 0 .
\end{aligned}
$$

$^3$ Clebsch-Gordan coefficients allow one to express eigenvectors of an operator $\hat{\mathbf{J}}_1 + \hat{\mathbf{J}}_2$ in terms of
eigenvectors of generic angular momentum operators $\hat{\mathbf{J}}_1$ and $\hat{\mathbf{J}}_2$.


For the equation in the last line to be true, one of two things should happen: either
$m_j = m + m_s$ or $C^{l,j}_{m,m_s,m_j} = 0$. This means that the Clebsch-Gordan coefficients
vanish unless $m = m_j - m_s$, so that they can be presented as

$$C^{l,j}_{m,m_s,m_j} = C^{l,j}_{m_s,m_j}\, \delta_{m,\,m_j - m_s} .$$


Substituting this result into Eq. 9.77, I can eliminate the summation over $m$ and
obtain a simplified form of this expansion:

$$
\begin{aligned}
|j, l, m_j\rangle &= \sum_{m_s} C^{l,j}_{m_s,m_j}\, |l, m_j - m_s\rangle |m_s\rangle \\
&= C^{l,j}_{1/2,m_j} \left|l, m_j - \tfrac{1}{2}\right\rangle |\uparrow\rangle
 + C^{l,j}_{-1/2,m_j} \left|l, m_j + \tfrac{1}{2}\right\rangle |\downarrow\rangle ,
\end{aligned} \tag{9.78}
$$

where the last line explicitly accounts for the fact that the spin number $m_s$ only
takes two values, 1/2 and $-1/2$. Equation 9.78 contains all the information about
Clebsch-Gordan coefficients that I could extract from operator $\hat{J}_z$ (which is not that
much), but hopefully I can learn more from operator $\hat{J}^2$.

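The idea is the same: apply $\hat{J}^2$ to the left-hand side of Eq. 9.78, its reincarnation
in the form $\hat{L}^2 + \hat{S}^2 + 2\hat{\mathbf{L}}\cdot\hat{\mathbf{S}}$ to this equation's right-hand side, and find the conditions
under which the two sides of the equation agree. The first step is just a recapitulation of Eq. 9.73:

$$
\hat{J}^2 |j, l, m_j\rangle = \hbar^2 j(j+1)\, |j, l, m_j\rangle
= \hbar^2 j(j+1) \left[ C^{l,j}_{1/2,m_j} \left|l, m_j - \tfrac{1}{2}\right\rangle |\uparrow\rangle
 + C^{l,j}_{-1/2,m_j} \left|l, m_j + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right], \tag{9.79}
$$

but the second one results in rather long expressions, which would not even fit on a
single page. Therefore, I will deal with the different terms in $\hat{L}^2 + \hat{S}^2 + 2\hat{\mathbf{L}}\cdot\hat{\mathbf{S}}$ separately.
First I will do $\hat{L}^2 + \hat{S}^2$, which is the easiest to handle:

$$
\begin{aligned}
\left(\hat{L}^2 + \hat{S}^2\right) &\left[ C^{l,j}_{1/2,m_j} \left|l, m_j - \tfrac{1}{2}\right\rangle |\uparrow\rangle
 + C^{l,j}_{-1/2,m_j} \left|l, m_j + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right] \\
&= \hbar^2 l(l+1) \left[ C^{l,j}_{1/2,m_j} \left|l, m_j - \tfrac{1}{2}\right\rangle |\uparrow\rangle
 + C^{l,j}_{-1/2,m_j} \left|l, m_j + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right] \\
&\quad + \tfrac{3}{4}\hbar^2 \left[ C^{l,j}_{1/2,m_j} \left|l, m_j - \tfrac{1}{2}\right\rangle |\uparrow\rangle
 + C^{l,j}_{-1/2,m_j} \left|l, m_j + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right] \\
&= \hbar^2 \left( l(l+1) + \tfrac{3}{4} \right) \left[ C^{l,j}_{1/2,m_j} \left|l, m_j - \tfrac{1}{2}\right\rangle |\uparrow\rangle
 + C^{l,j}_{-1/2,m_j} \left|l, m_j + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right].
\end{aligned} \tag{9.80}
$$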

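To evaluate the remaining $\hat{\mathbf{L}}\cdot\hat{\mathbf{S}}$ term, I first give it a makeover using ladder operators
$\hat{L}_\pm$ and $\hat{S}_\pm$:

$$
\begin{aligned}
\hat{\mathbf{L}}\cdot\hat{\mathbf{S}} &= \hat{L}_x \hat{S}_x + \hat{L}_y \hat{S}_y + \hat{L}_z \hat{S}_z \\
&= \hat{L}_z \hat{S}_z + \tfrac{1}{2}\left(\hat{L}_+ + \hat{L}_-\right) \tfrac{1}{2}\left(\hat{S}_+ + \hat{S}_-\right)
 + \tfrac{1}{2i}\left(\hat{L}_+ - \hat{L}_-\right) \tfrac{1}{2i}\left(\hat{S}_+ - \hat{S}_-\right) \\
&= \hat{L}_z \hat{S}_z + \tfrac{1}{2}\left(\hat{L}_- \hat{S}_+ + \hat{L}_+ \hat{S}_-\right),
\end{aligned} \tag{9.81}
$$

where I used Eqs. 3.59 and 3.60 for orbital and Eqs. 5.109 and 5.108 for spin operators.
Using the fact that $\left|l, m_j - \tfrac{1}{2}\right\rangle |\uparrow\rangle$ and $\left|l, m_j + \tfrac{1}{2}\right\rangle |\downarrow\rangle$ are eigenvectors of $\hat{L}_z$ and
$\hat{S}_z$ with eigenvalues $\hbar(m_j - 1/2)$, $\hbar/2$ and $\hbar(m_j + 1/2)$, $-\hbar/2$, respectively, I
get for the first term in the last line of Eq. 9.81: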


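$$
\begin{aligned}
\hat{L}_z \hat{S}_z &\left[ C^{l,j}_{1/2,m_j} \left|l, m_j - \tfrac{1}{2}\right\rangle |\uparrow\rangle
 + C^{l,j}_{-1/2,m_j} \left|l, m_j + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right] \\
&= \frac{\hbar^2}{2}\left(m_j - \tfrac{1}{2}\right) C^{l,j}_{1/2,m_j} \left|l, m_j - \tfrac{1}{2}\right\rangle |\uparrow\rangle
 - \frac{\hbar^2}{2}\left(m_j + \tfrac{1}{2}\right) C^{l,j}_{-1/2,m_j} \left|l, m_j + \tfrac{1}{2}\right\rangle |\downarrow\rangle .
\end{aligned} \tag{9.82}
$$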
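The passage from the Cartesian form of $\hat{\mathbf{L}}\cdot\hat{\mathbf{S}}$ to the ladder-operator form in Eq. 9.81 hides one line of algebra that is easy to get wrong. Here is a sketch of it, assuming the standard relations $\hat{L}_x = (\hat{L}_+ + \hat{L}_-)/2$ and $\hat{L}_y = (\hat{L}_+ - \hat{L}_-)/2i$, and the same relations for spin (this is what I take the cited equations of Chaps. 3 and 5 to state):

```latex
\begin{aligned}
\hat{L}_x \hat{S}_x + \hat{L}_y \hat{S}_y
 &= \tfrac{1}{4}\left(\hat{L}_+ + \hat{L}_-\right)\left(\hat{S}_+ + \hat{S}_-\right)
  - \tfrac{1}{4}\left(\hat{L}_+ - \hat{L}_-\right)\left(\hat{S}_+ - \hat{S}_-\right) \\
 &= \tfrac{1}{4}\left[\,2\hat{L}_+ \hat{S}_- + 2\hat{L}_- \hat{S}_+\right]
  = \tfrac{1}{2}\left(\hat{L}_- \hat{S}_+ + \hat{L}_+ \hat{S}_-\right).
\end{aligned}
```

The terms $\hat{L}_+\hat{S}_+$ and $\hat{L}_-\hat{S}_-$ appear in both products with the same sign and cancel in the difference, which is why only the "raise one, lower the other" combinations survive.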


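To compute the contribution from $\hat{L}_-\hat{S}_+$ and $\hat{L}_+\hat{S}_-$, you need to recall that $\hat{S}_+ |\uparrow\rangle = 0$,
$\hat{S}_- |\downarrow\rangle = 0$, $\hat{S}_+ |\downarrow\rangle = \hbar |\uparrow\rangle$, $\hat{S}_- |\uparrow\rangle = \hbar |\downarrow\rangle$. (These formulas originally
appeared in Sect. 5.2.3, Eqs. 5.104 and 5.106, but I am reposting them here for your
convenience.) You will also need to go back to Eqs. 3.75 and 3.76 to figure out
the part related to operators $\hat{L}_\pm$. Having refreshed your memory of the ladder
operators this way, you can get

$$
\begin{aligned}
\hat{L}_- \hat{S}_+ &\left[ C^{l,j}_{1/2,m_j} \left|l, m_j - \tfrac{1}{2}\right\rangle |\uparrow\rangle
 + C^{l,j}_{-1/2,m_j} \left|l, m_j + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right] \\
&= \hbar^2 \sqrt{l(l+1) - \left(m_j + \tfrac{1}{2}\right)\left(m_j - \tfrac{1}{2}\right)}\;
 C^{l,j}_{-1/2,m_j} \left|l, m_j - \tfrac{1}{2}\right\rangle |\uparrow\rangle
\end{aligned} \tag{9.83}
$$

and

$$
\begin{aligned}
\hat{L}_+ \hat{S}_- &\left[ C^{l,j}_{1/2,m_j} \left|l, m_j - \tfrac{1}{2}\right\rangle |\uparrow\rangle
 + C^{l,j}_{-1/2,m_j} \left|l, m_j + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right] \\
&= \hbar^2 \sqrt{l(l+1) - \left(m_j + \tfrac{1}{2}\right)\left(m_j - \tfrac{1}{2}\right)}\;
 C^{l,j}_{1/2,m_j} \left|l, m_j + \tfrac{1}{2}\right\rangle |\downarrow\rangle .
\end{aligned} \tag{9.84}
$$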


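Finally, you just need to bring together all Eqs. 9.80-9.84 and apply some simple
algebra (just group together the like terms) to cross the goal line:

$$
\begin{aligned}
&\left(\hat{L}^2 + \hat{S}^2 + 2\hat{\mathbf{L}}\cdot\hat{\mathbf{S}}\right)
 \left[ C^{l,j}_{1/2,m_j} \left|l, m_j - \tfrac{1}{2}\right\rangle |\uparrow\rangle
 + C^{l,j}_{-1/2,m_j} \left|l, m_j + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right] = \\
&\quad \hbar^2 \left[ C^{l,j}_{1/2,m_j} \left( l(l+1) + m_j + \tfrac{1}{4} \right)
 + C^{l,j}_{-1/2,m_j} \sqrt{l(l+1) - \left(m_j + \tfrac{1}{2}\right)\left(m_j - \tfrac{1}{2}\right)} \right]
 \left|l, m_j - \tfrac{1}{2}\right\rangle |\uparrow\rangle \; + \\
&\quad \hbar^2 \left[ C^{l,j}_{-1/2,m_j} \left( l(l+1) - m_j + \tfrac{1}{4} \right)
 + C^{l,j}_{1/2,m_j} \sqrt{l(l+1) - \left(m_j + \tfrac{1}{2}\right)\left(m_j - \tfrac{1}{2}\right)} \right]
 \left|l, m_j + \tfrac{1}{2}\right\rangle |\downarrow\rangle .
\end{aligned}
$$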
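Grouping like terms by hand is exactly the kind of step where a sign or a factor of 1/2 slips away, so a numerical cross-check may be reassuring. The sketch below is my own illustration, not part of the derivation: it stores a state as a dictionary `{(m, ms): amplitude}`, works in units with $\hbar = 1$, applies $\hat{L}^2 + \hat{S}^2 + 2\hat{L}_z\hat{S}_z + \hat{L}_-\hat{S}_+ + \hat{L}_+\hat{S}_-$ operator by operator, and compares the outcome with the grouped expression above. All function names and the sample numbers are hypothetical.

```python
import math

# States are dicts {(m, ms): amplitude}; units with hbar = 1 (assumption of this sketch).

def ladder_l(state, l, step):
    """Apply L_+ (step=+1) or L_- (step=-1): L_pm|l,m> = sqrt(l(l+1) - m(m pm 1)) |l,m pm 1>."""
    out = {}
    for (m, ms), amp in state.items():
        if abs(m + step) > l:
            continue  # the ladder terminates at m = +l or m = -l
        key = (m + step, ms)
        out[key] = out.get(key, 0.0) + math.sqrt(l * (l + 1) - m * (m + step)) * amp
    return out

def ladder_s(state, step):
    """Apply S_+ (step=+1) or S_- (step=-1) to the spin-1/2 part."""
    out = {}
    for (m, ms), amp in state.items():
        if ms == -0.5 * step:  # S_+ acts only on spin-down, S_- only on spin-up
            key = (m, ms + step)
            out[key] = out.get(key, 0.0) + amp
    return out

def j_squared(state, l):
    """Apply L^2 + S^2 + 2 L_z S_z + L_- S_+ + L_+ S_- to the state."""
    out = {key: (l * (l + 1) + 0.75 + 2 * key[0] * key[1]) * amp
           for key, amp in state.items()}
    for flipped in (ladder_l(ladder_s(state, +1), l, -1),   # L_- S_+
                    ladder_l(ladder_s(state, -1), l, +1)):  # L_+ S_-
        for key, amp in flipped.items():
            out[key] = out.get(key, 0.0) + amp
    return out

def grouped_formula(l, mj, c_up, c_dn):
    """Coefficients of |l,mj-1/2>|up> and |l,mj+1/2>|down> after grouping like terms."""
    root = math.sqrt(l * (l + 1) - (mj + 0.5) * (mj - 0.5))
    return (c_up * (l * (l + 1) + mj + 0.25) + c_dn * root,
            c_dn * (l * (l + 1) - mj + 0.25) + c_up * root)

# Arbitrary illustrative numbers: l = 2, mj = 1/2, and two made-up amplitudes.
l, mj, c_up, c_dn = 2, 0.5, 0.3, -0.7
m_up, m_dn = round(mj - 0.5), round(mj + 0.5)
state = {(m_up, 0.5): c_up, (m_dn, -0.5): c_dn}
result = j_squared(state, l)
expected = grouped_formula(l, mj, c_up, c_dn)
```

The dictionary representation is chosen only because it mirrors the way the ladder operators act on individual basis kets; a matrix-based check would work just as well.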


Comparing this against Eq. 9.79 and equating coefficients in front of each of the
vectors, you will end up with the following system of equations for coefficients
$C^{l,j}_{1/2,m_j}$ and $C^{l,j}_{-1/2,m_j}$:

$$
\left( l(l+1) - j(j+1) + m_j + \tfrac{1}{4} \right) C^{l,j}_{1/2,m_j}
 + \sqrt{l(l+1) - \left(m_j + \tfrac{1}{2}\right)\left(m_j - \tfrac{1}{2}\right)}\; C^{l,j}_{-1/2,m_j} = 0 \tag{9.85}
$$

$$
\sqrt{l(l+1) - \left(m_j + \tfrac{1}{2}\right)\left(m_j - \tfrac{1}{2}\right)}\; C^{l,j}_{1/2,m_j}
 + \left( l(l+1) - j(j+1) + \tfrac{1}{4} - m_j \right) C^{l,j}_{-1/2,m_j} = 0. \tag{9.86}
$$

And once again you are looking for non-zero solutions of a homogeneous system of
linear equations, and once again you need to find zeroes of the determinant formed
by its coefficients:

$$
\begin{vmatrix}
l(l+1) - j(j+1) + \tfrac{1}{4} + m_j & \sqrt{l(l+1) - \left(m_j + \tfrac{1}{2}\right)\left(m_j - \tfrac{1}{2}\right)} \\
\sqrt{l(l+1) - \left(m_j + \tfrac{1}{2}\right)\left(m_j - \tfrac{1}{2}\right)} & l(l+1) - j(j+1) + \tfrac{1}{4} - m_j
\end{vmatrix} = 0 .
$$


Evaluation of the determinant yields

$$
\begin{aligned}
&\left( l(l+1) - j(j+1) + \tfrac{1}{4} + m_j \right) \left( l(l+1) - j(j+1) + \tfrac{1}{4} - m_j \right)
 - l(l+1) + \left(m_j + \tfrac{1}{2}\right)\left(m_j - \tfrac{1}{2}\right) \\
&\qquad = \left( l(l+1) - j(j+1) + \tfrac{1}{4} \right)^2 - l(l+1) - \tfrac{1}{4}
 = \left( l(l+1) - j(j+1) + \tfrac{1}{4} \right)^2 - \left( l + \tfrac{1}{2} \right)^2 ,
\end{aligned}
$$

where I used the easily verified identity

$$l(l+1) + \tfrac{1}{4} = \left( l + \tfrac{1}{2} \right)^2 . \tag{9.87}$$

Now it is quite easy to find that the equation

$$\left[ \left( l + \tfrac{1}{2} \right)^2 - j(j+1) \right]^2 = \left( l + \tfrac{1}{2} \right)^2$$

is satisfied for

$$j(j+1) = \left( l + \tfrac{1}{2} \right)\left( l + \tfrac{3}{2} \right)$$

or

$$j(j+1) = \left( l + \tfrac{1}{2} \right)\left( l - \tfrac{1}{2} \right).$$

The only physically meaningful solutions of these equations are

$$j_1 = l + \tfrac{1}{2} \tag{9.88}$$

and

$$j_2 = l - \tfrac{1}{2} . \tag{9.89}$$

(Two other solutions, $-l - 3/2$ and $-l - 1/2$, are negative and must be ignored.) The
obtained result means that for any value of the orbital quantum number $l$, operator $\hat{J}^2$
has two possible eigenvalues, $\hbar^2 j_1(j_1 + 1)$ and $\hbar^2 j_2(j_2 + 1)$, with $j_1$ and $j_2$ defined
above. For each value of $j$, there are $2j + 1$ values of $m_j$, $m_j = -j, -j+1, \ldots, j-1, j$,
so that the total number of states $|j, l, m_j\rangle$ (for a given $l$) is
$\left[ 2(l + 1/2) + 1 \right] + \left[ 2(l - 1/2) + 1 \right] = 2(2l + 1)$, which is exactly the same as the number of states
$|l, m\rangle |m_s\rangle$. One important conclusion from this arithmetic is that orthogonal and
linearly independent states $|j, l, m_j\rangle$ and other orthogonal and independent states
$|l, m\rangle |m_s\rangle$ represent two alternative bases in the same vector space: vectors of the
former basis are defined by the states in which the measurement of the total angular
momentum and its component would yield determinate results, and vectors of the
latter basis correspond to the states in which orbital and spin momenta separately
would have definite values.
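If you prefer to see numbers rather than determinants, the same conclusion can be checked on a computer. The sketch below is an illustration under my own assumptions (units with $\hbar = 1$, hypothetical function names): it builds the $2\times 2$ block of $\hat{J}^2$ in the basis $\left|l, m_j - \tfrac12\right\rangle|\uparrow\rangle$, $\left|l, m_j + \tfrac12\right\rangle|\downarrow\rangle$ whose entries can be read off the coefficients of Eqs. 9.85 and 9.86, finds its two eigenvalues, converts each eigenvalue $\lambda$ into $j$ via $j(j+1) = \lambda$, and counts the states.

```python
import math

def j2_block(l, mj):
    """Entries (a, d, b) of the symmetric block [[a, b], [b, d]] of J^2 (hbar = 1)
    in the basis {|l, mj-1/2>|up>, |l, mj+1/2>|down>}."""
    a = l * (l + 1) + mj + 0.25
    d = l * (l + 1) - mj + 0.25
    b = math.sqrt(l * (l + 1) - (mj + 0.5) * (mj - 0.5))
    return a, d, b

def eigenvalues_2x2(a, d, b):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]], smaller one first."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius

def j_from_eigenvalue(lam):
    """Non-negative root of j(j+1) = lam."""
    return -0.5 + math.sqrt(0.25 + lam)

def allowed_j(l, mj):
    """The two values of j supported by a given l and mj (valid for |mj| not exceeding l - 1/2)."""
    low, high = eigenvalues_2x2(*j2_block(l, mj))
    return j_from_eigenvalue(low), j_from_eigenvalue(high)

def count_states(l):
    """Number of |j, l, mj> states summed over j = l + 1/2 and j = l - 1/2."""
    return int(2 * (l + 0.5) + 1) + int(2 * (l - 0.5) + 1)

j_low, j_high = allowed_j(2, 0.5)   # sample values l = 2, mj = 1/2
n_states = count_states(2)          # to be compared with 2 * (2 * 2 + 1)
```

For the two extreme values $m_j = \pm(l + 1/2)$ one of the two basis vectors does not exist, so the block degenerates to a single number; the sketch deliberately leaves those cases out.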

Now I can go back to Eqs. 9.85 and 9.86 and find the Clebsch-Gordan coefficients
that establish a connection between vectors $|j, l, m_j\rangle$ and vectors $|l, m\rangle |m_s\rangle$,
signaling that the end of this journey is close. Substituting the found values for $j_1$ and $j_2$ into
Eqs. 9.85 and 9.86, I find the two sets of coefficients:

$$
C^{l,j_1}_{-1/2,m_j} = \frac{l + \tfrac{1}{2} - m_j}{\sqrt{l(l+1) - m_j^2 + \tfrac{1}{4}}}\; C^{l,j_1}_{1/2,m_j}
 = \sqrt{\frac{l + \tfrac{1}{2} - m_j}{l + \tfrac{1}{2} + m_j}}\; C^{l,j_1}_{1/2,m_j} \tag{9.90}
$$

$$
C^{l,j_2}_{1/2,m_j} = -\frac{l + \tfrac{1}{2} - m_j}{\sqrt{l(l+1) - m_j^2 + \tfrac{1}{4}}}\; C^{l,j_2}_{-1/2,m_j}
 = -\sqrt{\frac{l + \tfrac{1}{2} - m_j}{l + \tfrac{1}{2} + m_j}}\; C^{l,j_2}_{-1/2,m_j} , \tag{9.91}
$$

where I again used Eq. 9.87. As usual, Eqs. 9.85 and 9.86 yield only the ratio of
the coefficients, and in order to find the coefficients themselves, the normalization
requirement, complemented by the convention that the Clebsch-Gordan coefficients
remain real, needs to be invoked. Substituting Eqs. 9.90 and 9.91 into the normalization
condition

$$\left| C^{l,j}_{-1/2,m_j} \right|^2 + \left| C^{l,j}_{1/2,m_j} \right|^2 = 1 ,$$

I find after some trivial algebra

$$
\begin{aligned}
C^{l,j_1}_{1/2,m_j} &= \sqrt{\frac{l + \tfrac{1}{2} + m_j}{2l+1}} ; &
C^{l,j_1}_{-1/2,m_j} &= \sqrt{\frac{l + \tfrac{1}{2} - m_j}{2l+1}} \\
C^{l,j_2}_{1/2,m_j} &= -\sqrt{\frac{l + \tfrac{1}{2} - m_j}{2l+1}} ; &
C^{l,j_2}_{-1/2,m_j} &= \sqrt{\frac{l + \tfrac{1}{2} + m_j}{2l+1}} .
\end{aligned} \tag{9.92}
$$
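Since "trivial algebra" is where most mistakes live, here is a small numerical sanity check of the coefficients. It is only a sketch under stated assumptions: the formulas are the standard spin-1/2 Clebsch-Gordan coefficients with the common sign convention (minus sign on the spin-up component for $j = l - 1/2$), and the function names are mine. The code confirms that each pair is normalized, that the two pairs are mutually orthogonal, and that both homogeneous equations, Eqs. 9.85 and 9.86, are satisfied.

```python
import math

def cg_pair(l, mj, branch):
    """Return (C_up, C_down) for j = l + 1/2 (branch=+1) or j = l - 1/2 (branch=-1)."""
    big = math.sqrt((l + 0.5 + mj) / (2 * l + 1))
    small = math.sqrt((l + 0.5 - mj) / (2 * l + 1))
    return (big, small) if branch == +1 else (-small, big)

def residuals(l, mj, branch):
    """Left-hand sides of the two homogeneous equations; both should vanish."""
    j = l + 0.5 * branch
    c_up, c_dn = cg_pair(l, mj, branch)
    a = l * (l + 1) - j * (j + 1) + 0.25
    root = math.sqrt(l * (l + 1) - (mj + 0.5) * (mj - 0.5))
    return (a + mj) * c_up + root * c_dn, root * c_up + (a - mj) * c_dn

def norm_squared(pair):
    """Sum of squared coefficients of a (C_up, C_down) pair."""
    return pair[0] ** 2 + pair[1] ** 2

def overlap(pair_a, pair_b):
    """Scalar product of two real coefficient pairs."""
    return pair_a[0] * pair_b[0] + pair_a[1] * pair_b[1]

# Sample values (illustrative only): l = 3, mj = -3/2.
plus, minus = cg_pair(3, -1.5, +1), cg_pair(3, -1.5, -1)
```

Calling `residuals(3, -1.5, +1)` and `residuals(3, -1.5, -1)` should return pairs of numbers that are zero to machine precision.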


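Now you just plug Eq. 9.92 into Eq. 9.78 to derive the final expressions for the two
eigenvectors of operator $\hat{J}^2$ characterized by quantum numbers $j_1$ and $j_2$ in terms of
linear combinations of the orbital and spin angular momentum eigenvectors:

$$
\left| l + \tfrac{1}{2}, l, m_j \right\rangle = \frac{1}{\sqrt{2l+1}} \left[
 \sqrt{l + m_j + \tfrac{1}{2}}\, \left|l, m_j - \tfrac{1}{2}\right\rangle |\uparrow\rangle
 + \sqrt{l - m_j + \tfrac{1}{2}}\, \left|l, m_j + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right] \tag{9.93}
$$

$$
\left| l - \tfrac{1}{2}, l, m_j \right\rangle = \frac{1}{\sqrt{2l+1}} \left[
 -\sqrt{l - m_j + \tfrac{1}{2}}\, \left|l, m_j - \tfrac{1}{2}\right\rangle |\uparrow\rangle
 + \sqrt{l + m_j + \tfrac{1}{2}}\, \left|l, m_j + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right]. \tag{9.94}
$$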


It is quite easy to verify that vectors $|l + 1/2, l, m_j\rangle$ and $|l - 1/2, l, m_j\rangle$ are
normalized and orthogonal, as they should be. One can interpret this result by saying
that if an electron is prepared in a state with determinate values of the total angular
momentum, $\hbar^2 j(j+1)$, one of its components, $\hbar m_j$, and the total orbital momentum,
$\hbar^2 l(l+1)$, the values of the corresponding components of its orbital momentum,
$\hbar m$, and spin, $\hbar m_s$, remain uncertain. An attempt to measure them will produce the
combination $m = m_j - 1/2$, $m_s = 1/2$ with probabilities

$$
P_{m_j - 1/2,\, 1/2} = \begin{cases}
 \dfrac{l + m_j + 1/2}{2l+1}, & j = l + 1/2 \\[2ex]
 \dfrac{l - m_j + 1/2}{2l+1}, & j = l - 1/2
\end{cases} \tag{9.95}
$$

or the combination $m = m_j + 1/2$, $m_s = -1/2$ with probabilities

$$
P_{m_j + 1/2,\, -1/2} = \begin{cases}
 \dfrac{l - m_j + 1/2}{2l+1}, & j = l + 1/2 \\[2ex]
 \dfrac{l + m_j + 1/2}{2l+1}, & j = l - 1/2 .
\end{cases} \tag{9.96}
$$


To help you feel better about these results, let me illustrate the application of
Eqs. 9.95 and 9.96 by a few examples.

Example 27 (Measuring Spin and Orbital Angular Momentums in the State with the
Definite Value of the Total Angular Momentum) Assume that an electron is in a
state with a given orbital momentum $l$, total angular momentum $j = l - 1/2$, and
its z-component $m_j = l - 3/2$, and that you have a magic instrument allowing you
to measure the z-components of the electron's orbital momentum and its spin. What are
the possible outcomes of such a measurement and their probabilities?

Solution

Value $m_j = l - 3/2$ can be obtained in two different ways: when $m_s = 1/2$ and
$m = l - 2$, or when $m_s = -1/2$ and $m = l - 1$. The probability of the first outcome is
(second line in Eq. 9.95)

$$P_{l-2,\,1/2} = \frac{l - (l - 3/2) + 1/2}{2l+1} = \frac{2}{2l+1},$$

and the probability of the second outcome (second line in Eq. 9.96) is

$$P_{l-1,\,-1/2} = \frac{l + (l - 3/2) + 1/2}{2l+1} = \frac{2l-1}{2l+1}.$$

Obviously the sum of the two probabilities is equal to one, and for large values of $l$,
the second outcome is significantly more probable.

Example 28 (More on Measurement of Spin and Orbital Momentums) Let me
modify the previous example by assuming that the value of the total angular
momentum is not known, but it is known that the electron can be in either state
of the total angular momentum with equal probability. How will the answer to the
previous example change in this case?

Solution

Now you have to take into account that both possible outcomes discussed in the
previous example can come either from the state with $j = l + 1/2$ or the state with
$j = l - 1/2$. Accordingly, the total probabilities of the outcomes become

$$
\begin{aligned}
P_{l-2,\,1/2} &= \frac{1}{2}\,\frac{l - (l - 3/2) + 1/2}{2l+1} + \frac{1}{2}\,\frac{l + (l - 3/2) + 1/2}{2l+1}
 = \frac{l + 1/2}{2l+1} = \frac{1}{2} \\
P_{l-1,\,-1/2} &= \frac{1}{2}\,\frac{l + (l - 3/2) + 1/2}{2l+1} + \frac{1}{2}\,\frac{l - (l - 3/2) + 1/2}{2l+1}
 = \frac{1}{2} .
\end{aligned}
$$
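The arithmetic of Examples 27 and 28 is easy to reproduce exactly with rational numbers. The following sketch is an illustration only; the function names are hypothetical, and the probabilities are taken to be $(l \pm m_j + 1/2)/(2l+1)$ as in Eqs. 9.95 and 9.96. It returns the two outcome probabilities for a given $j$ and then averages them over the two values of $j$ with equal weights, as in Example 28.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def outcome_probabilities(l, mj, j):
    """Return {(m, ms): probability} for a measurement of L_z and S_z in the state |j, l, mj>."""
    l, mj, j = Fraction(l), Fraction(mj), Fraction(j)
    big = (l + mj + HALF) / (2 * l + 1)
    small = (l - mj + HALF) / (2 * l + 1)
    if j == l + HALF:
        p_up, p_down = big, small
    elif j == l - HALF:
        p_up, p_down = small, big
    else:
        raise ValueError("j must equal l + 1/2 or l - 1/2")
    return {(mj - HALF, HALF): p_up, (mj + HALF, -HALF): p_down}

def equal_mixture(l, mj):
    """Average the outcome probabilities over j = l + 1/2 and j = l - 1/2 with weights 1/2."""
    l = Fraction(l)
    first = outcome_probabilities(l, mj, l + HALF)
    second = outcome_probabilities(l, mj, l - HALF)
    return {key: HALF * first[key] + HALF * second[key] for key in first}

# Example 27 with the sample value l = 5: j = l - 1/2 = 9/2 and mj = l - 3/2 = 7/2.
example_27 = outcome_probabilities(5, Fraction(7, 2), Fraction(9, 2))
# Example 28 for the same l and mj.
example_28 = equal_mixture(5, Fraction(7, 2))
```

Using `Fraction` instead of floating-point numbers keeps the results in the same form as the formulas in the text, so they can be compared term by term.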

Even though, generally speaking, either $m_j$ or $m$ and $m_s$ cannot be known with
certainty in the same state, there exist two states in which all three of these quantum
numbers have definite values. These are the states with the largest, $m_j = l + 1/2$,
and smallest, $m_j = -l - 1/2$, values of $m_j$, for which one of the Clebsch-Gordan
coefficients vanishes, while the other one turns to unity, reducing Eq. 9.93 to

$$
\left| l + \tfrac{1}{2}, l, l + \tfrac{1}{2} \right\rangle = |l, l\rangle |\uparrow\rangle ; \qquad
\left| l + \tfrac{1}{2}, l, -l - \tfrac{1}{2} \right\rangle = |l, -l\rangle |\downarrow\rangle .
$$

You can easily understand this fact by noting that $m_j = l + 1/2$ or $m_j = -l - 1/2$
can be obtained only by a single combination of $m$ and $m_s$: $m_j = l + 1/2$ corresponds
to the choice $m = l$ and $m_s = 1/2$, while $m_j = -l - 1/2$ can only be generated by
$m = -l$, $m_s = -1/2$.

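Equations 9.88 and 9.89 together with Eqs. 9.93 and 9.94 provide answers to
all the questions posed in the beginning of this subsection: you now know the
relation between total, orbital, and spin angular momentum quantum numbers as
well as between corresponding eigenvectors. In particular, Eqs. 9.93 and 9.94 allow
generating any representation for $|j, l, m_j\rangle$, using corresponding representations
for vectors $|l, m\rangle$ and $|m_s\rangle$, as well as defining the action of any combination of
orbital and spin operators on these vectors. To illustrate this point, I will write
down the coordinate-spinor representation of $|l + 1/2, l, m_j\rangle$ using Eq. 9.93 and the
corresponding representations for $|l, m\rangle$ and $|m_s\rangle$:

$$
\left| l + \tfrac{1}{2}, l, m_j \right\rangle = \frac{1}{\sqrt{2l+1}}
\begin{bmatrix}
\sqrt{l + m_j + 1/2}\; Y_l^{m_j - 1/2}(\theta, \varphi) \\
\sqrt{l - m_j + 1/2}\; Y_l^{m_j + 1/2}(\theta, \varphi)
\end{bmatrix}.
$$

I can now use this to compute, e.g., $\hat{L}_+ \hat{S}_- |l + 1/2, l, m_j\rangle$. Taking the matrix
representation for $\hat{S}_-$ from Eq. 5.107 and recalling that any orbital operator in the
spinor representation is multiplied by a unit matrix, I can rewrite this expression as

$$
\begin{aligned}
\hat{L}_+ \hat{S}_- \left| l + \tfrac{1}{2}, l, m_j \right\rangle
&= \frac{\hbar}{\sqrt{2l+1}}
\begin{bmatrix} \hat{L}_+ & 0 \\ 0 & \hat{L}_+ \end{bmatrix}
\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}
\begin{bmatrix}
\sqrt{l + m_j + 1/2}\; Y_l^{m_j - 1/2}(\theta, \varphi) \\
\sqrt{l - m_j + 1/2}\; Y_l^{m_j + 1/2}(\theta, \varphi)
\end{bmatrix} \\
&= \frac{\hbar \sqrt{l + m_j + 1/2}}{\sqrt{2l+1}}
\begin{bmatrix} \hat{L}_+ & 0 \\ 0 & \hat{L}_+ \end{bmatrix}
\begin{bmatrix} 0 \\ Y_l^{m_j - 1/2}(\theta, \varphi) \end{bmatrix}
= \frac{\hbar \sqrt{l + m_j + 1/2}}{\sqrt{2l+1}}
\begin{bmatrix} 0 \\ \hat{L}_+ Y_l^{m_j - 1/2}(\theta, \varphi) \end{bmatrix} \\
&= \frac{\hbar^2 \sqrt{l + m_j + 1/2}}{\sqrt{2l+1}} \sqrt{l(l+1) - (m_j - 1/2)(m_j + 1/2)}
\begin{bmatrix} 0 \\ Y_l^{m_j + 1/2}(\theta, \varphi) \end{bmatrix} \\
&= \hbar^2 (l + m_j + 1/2) \sqrt{\frac{l - m_j + 1/2}{2l+1}}
\begin{bmatrix} 0 \\ Y_l^{m_j + 1/2}(\theta, \varphi) \end{bmatrix}.
\end{aligned}
$$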

9.6 Problems 

Section 9.2 


Problem 112 Write down a spinor corresponding to the point on the Bloch sphere
with coordinates $\theta = \pi/4$, $\varphi = 3\pi/2$.

Problem 113 The impossibility of half-integer values of the angular momentum
for orbital angular momentum operators expressed in terms of coordinate and
momentum operators can be demonstrated by considering the following example.
Imagine that there exists a state of the orbital angular momentum with $l = 1/2$.
Then in the coordinate representation, such states would be represented by two
functions $f_{1/2}(\theta, \varphi)$ and $f_{-1/2}(\theta, \varphi)$ corresponding to the values of the magnetic
quantum number $m = 1/2$ and $m = -1/2$, respectively. These functions must
obey the following set of equations:

$$
\begin{aligned}
\hat{L}_+ f_{1/2}(\theta, \varphi) &= 0 ; & \hat{L}_- f_{-1/2}(\theta, \varphi) &= 0 \\
\hat{L}_+ f_{-1/2}(\theta, \varphi) &= f_{1/2}(\theta, \varphi) ; & \hat{L}_- f_{1/2}(\theta, \varphi) &= f_{-1/2}(\theta, \varphi) .
\end{aligned}
$$

Using the coordinate representation of the ladder operators, show that these
equations are mutually inconsistent.

Problem 114 An electron is in a spin state described by the (non-normalized) spinor:

$$|\chi\rangle = \begin{bmatrix} 2i - 3 \\ 4 \end{bmatrix}.$$

1. Normalize this spinor. 

2. If you measure the z-component of the spin, what are the probabilities of various 
outcomes? 

3. What is the expectation value of the z-component of the spin in this state? 

4. Answer the same questions for x- and y-components. 

Problem 115 

1. Consider a spin in state

$$\begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$

You measure the component of the spin in the direction of the unit vector
$\mathbf{n}$ characterized by angles $\theta$, $\varphi$ of the spherical coordinate system. What is the
probability of obtaining value $-\hbar/2$ as an outcome of this measurement?

2. Imagine that you conduct two measurements in quick succession: first you
carry out the measurement described in the previous part of the problem, and
right after that, you measure the y-component of the spin. Find the probability
of getting $\hbar/2$ as an outcome of the last measurement. (Hint: Do not forget to
consider all possible paths that could lead to this outcome.)

Problem 116 Consider a particle with spin 1/2 in a state in which a component of
the spin in a specified direction is equal to $\hbar/2$. Choose a coordinate system with
the Z-axis along this direction and some arbitrary positions for X- and Y-axes in
the perpendicular plane. Now imagine that you measure a component of the spin in
a direction making angle 30° with the Z-axis and lying in the XZ plane. Find the
probabilities of the various outcomes of this measurement.


Section 9.3 

Problem 117 Derive the expression for the expectation value of the y-component 
of the spin in the state specified by Eq. 9.34. 

Problem 118 Consider a spin in the initial state characterized by angles $\theta = \pi/6$
and $\varphi = \pi/3$ of the Bloch sphere. At time $t = 0$, the magnetic field $\mathbf{B}$ directed
along the polar axis of the spherical coordinate system is turned on and remains on
for $t = \pi/(2\omega_L)$ seconds. After the field is off, an experimentalist measures the
z-component of the spin. What is the probability that the measurement yields $\hbar/2$?
$-\hbar/2$? Answer the same questions if it is the x-component of the spin that is being
measured.

Problem 119 In the last problem to Chap. 5, you found matrices S x , S y , and S z 
for a particle with spin 3/2. Assume that an interaction of this particle with its 
surrounding is described by Hamiltonian: 



1. Find the stationary states of this Hamiltonian. 

2. Assuming that the initial state of the particle is given by a generic spinor of the 
form 

rn 


_ 0 _ 

find the spin state of the particle at time t. 

3. Calculate the time-dependent expectation values of all three components of the 
spin operator. 

Problem 120 Consider a spin 1/2 particle in a time-dependent magnetic field,
which rotates with angular velocity $\Omega$ in the X-Y plane:

$$\mathbf{B} = \mathbf{i} B_0 \cos \Omega t + \mathbf{j} B_0 \sin \Omega t ,$$

where $\mathbf{i}$ and $\mathbf{j}$ are unit vectors in the directions of X and Y coordinate axes,
respectively. Derive the Heisenberg equations for the spin operators and solve them.
Note that since the Hamiltonian of this system is time-dependent, you cannot claim the
same form for the Hamiltonian in Schrödinger and Heisenberg pictures based upon
the notion that the time-evolution operator $\hat{U}$ commutes with the Hamiltonian (it
does not because it does not have the form of $\exp(-i\hat{H}t/\hbar)$, which is only valid for
time-independent Hamiltonians). Nevertheless, since the time-dependent factor in
the Hamiltonian does not contain operators, you can still show that the Heisenberg
form of the Hamiltonian, which in the Schrödinger picture has the form

$$\hat{H} = \frac{2\mu_B}{\hbar}\, \hat{\mathbf{S}} \cdot \mathbf{B} ,$$

has exactly the same form in the Heisenberg picture if the Schrödinger spin operator
is replaced with its time-dependent Heisenberg operator.

1. Convince yourself that this is, indeed, the case.

2. Derive the Heisenberg equations for all three components of the spin operators.

3. Solve these equations and find the time dependence of the spin operators. (Hint:
You might want to introduce new time-dependent operators defined as

$$\hat{P} = \hat{S}_x \cos \Omega t + \hat{S}_y \sin \Omega t$$

$$\hat{Q} = \hat{S}_y \cos \Omega t - \hat{S}_x \sin \Omega t$$

and derive equations for them.)


Section 9.4 

Problem 121 Normalize the following vector belonging to the tensor product of 
two spaces: 


W) = 2 i 

assuming that vectors 


S") (]<•“) - 3/14 2) » + (21«!") - 314"» |4 2> ) • 

j and ef \j are normalized and mutually orthogonal. 


Problem 122 Compute commutators 
where ij take values x, y, and z. 

Problem 123 Assuming that vectors 


s] ,p> , 5j' p) ] for all i ^ j and sf p) , (s ( ' P) ) 2 


e'/jj and ef^j in Problem 121 correspond 


to spin-up and spin-down states of two particles as defined by operators S z 
correspondingly, compute 


( 1 , 2 ) 


{f\s 


(i) 


s (2) W), 


where vector $|\psi\rangle$ is also defined in Problem 121.

Problem 124 Derive Eqs. 9.46 through 9.49.

Problem 125 Consider a system of two interacting spins described by the Hamiltonian:

$$\hat{H} = \frac{2\mu_B}{\hbar}\, \hat{S}_z^{(1)} B + \frac{2\mu_B}{\hbar}\, \hat{S}_z^{(2)} B + J\, \hat{\mathbf{S}}^{(1)} \cdot \hat{\mathbf{S}}^{(2)} .$$

Find the eigenvalues and eigenvectors of this Hamiltonian. Do it in two different
ways: first, use eigenvectors of individual $\hat{S}_z^{(1,2)}$ operators as a basis, and second, use
eigenvectors of the operators of the total spin. Find the ground state of the system
for different relations between the magnetic field and parameter $J$. Consider cases
$J > 0$ and $J < 0$.


For Sect. 9.5.1 

Problem 126 Using the approach presented in Sect. 9.4, consider addition of the
operators of the orbital angular momentum and spin, limiting your consideration to
the orbital states with $l = 1$.

1. Construct the matrix of the operator $\hat{J}^2$, where $\hat{\mathbf{J}} = \hat{\mathbf{L}} + \hat{\mathbf{S}}$, in the basis
of eigenvectors of operators $\hat{L}^2$, $\hat{L}_z$, and $\hat{S}_z$, taking into account only those
eigenvectors which belong to the orbital quantum number $l = 1$. (Hint: Your
basis will consist of 6 vectors, so that you are looking for a 6 x 6 matrix.)

2. Diagonalize the matrix and confirm that eigenvectors of $\hat{J}^2$ are characterized by
quantum numbers $j = 1/2$ and $j = 3/2$.

3. Find the eigenvectors of $\hat{J}^2$ in this basis.

Problem 127 

1. Write down an expression for a spinor describing the equal superposition of
states, in which an electron in the ground state of an infinite one-dimensional
potential is also in a spin-up state, while an electron in the first excited state of
this potential is also in the spin-down state. The potential confines the electron's
motion in the x direction, while spin-up and spin-down states correspond to the
z-component of the spin.

2. Imagine that you have measured a component of the spin in the x direction and
obtained value $\hbar/2$. Find the probability distribution of the electron's coordinate
right after this measurement.

Problem 128 A one-dimensional harmonic oscillator is placed in a state

$$|\alpha\rangle = \frac{1}{\sqrt{2}} \left[ |0\rangle |\uparrow\rangle + |1\rangle |\downarrow\rangle \right],$$

where spin-up and spin-down states are defined with respect to the z-component of
the spin operator and kets $|0\rangle$ and $|1\rangle$ correspond to the ground state and the first
excited state of a harmonic oscillator. At time $t = 0$ an experimentalist turns on a
uniform magnetic field in the z direction. Find the state of the system at a later time
$t$, and compute the expectation values of the oscillator's coordinate and momentum.
(Hint: You can use Eqs. 9.68 and 9.69 with the orbital part of the Hamiltonian taken
to be that of a harmonic oscillator.)


For Sect. 9.5.2 

Problem 129 Compute the expectation value of all components of the operator 

j = L + S 


as well as of operator J 2 in state 


\x) 


1 

Vl4 


Y 1 ,- 2 (o,<p) 


2 Y‘(d,<p) 



+ 3iYf ( 0, <p) 



Problem 130 Derive Eq. 9.92. 

Problem 131 Consider an electron in a state with $l = 2$, $j = 3/2$, and $m_j = 0$.
If one measures the z-components of the electron's orbital momentum and spin, what
are the possible values and their probabilities?

Problem 132 Let me reverse the previous problem: assume that the electron is in
the state with $l = 2$, $m = 1$, and $m_s = -1/2$. What are the possible values of $j$ and
their probabilities?

Problem 133 Consider an electron in the following state (in the coordinate repre¬ 
sentation): 


i«) 


TTo 


Y\{9,<P) 



1 

VTo 


Y° 2 (e,cp) 



i 

vTo 


iT 1 


+ 7ro ¥ ^ v) 



1. If one measures $\hat{J}^2$ and $\hat{J}_z$, what values can one expect to observe and what are
their probabilities?

2. Present this vector as a linear combination of appropriate vectors $|j, l, m_j\rangle$.

Section 9.5.2

Problem 134 Compute commutators and p x ,/ z J, and demonstrate that 

they have a standard for the angular momentum operators form. 

Problem 135 Write down the position-spinor representation of vector $|l - 1/2, l, m_j\rangle$,
and compute $\hat{L}_- \hat{S}_+ |l - 1/2, l, m_j\rangle$ using this representation.


Chapter 10 

Two-Level System in a Periodic External 
Field 




I have already mentioned somewhere in the beginning of this book that while vectors
representing states of realistic physical systems generally belong to an
infinite-dimensional vector space, we can always (well, almost always) justify limiting
our consideration to a subspace of states with a reasonably small dimension. The
smallest nontrivial subspace containing states that can be assumed to be isolated
from the rest of the space is two-dimensional. One relatively clean example of such
a subspace is formed by two-dimensional spinors in situations when one can
neglect interactions between spins of different particles as well as the spin-orbit
interaction. An approximately isolated two-dimensional subspace can also be found
in systems described by Hamiltonians with a discrete spectrum, if this spectrum is
strongly non-equidistant, i.e., the energy intervals between adjacent energy levels,
$\Delta_i = E_{i+1} - E_i$, are different for different pairs of levels. Two-level models are
very popular in various areas of physics because, on the one hand, they are remarkably
simple, while on the other hand, they capture essential properties of many real
physical systems ranging from atoms to semiconductors.

The most popular (and useful) version of this model involves an interaction of a
two-level system with a periodic time-dependent external “potential.” This can be
an electric dipole potential describing the interaction of an atomic electron with an electric
field or a magnetic “potential” describing the interaction of an electron spin with a
time-dependent magnetic field. Since I am not going to go into concrete details of a
physical system which this model is supposed to represent, I will introduce it by
assuming that its Hamiltonian is a sum of a time-independent “unperturbed” part
$\hat{H}_0$ and the time-dependent “perturbation” $\hat{V}(t)$. I will also assume that $\hat{H}_0$ has only
two linearly independent and orthogonal eigenvectors, which I will designate as $|1\rangle$
and $|2\rangle$, and two corresponding eigenvalues $E_1^{(0)}$ and $E_2^{(0)}$, which may be degenerate.

It is easy to see now that $\hat{H}_0$ can be written as

$$\hat{H}_0 = E_1^{(0)} |1\rangle \langle 1| + E_2^{(0)} |2\rangle \langle 2| . \tag{10.1}$$

Indeed, taking into account the orthogonality and normalization of $|1\rangle$ and $|2\rangle$, you
can find

$$\hat{H}_0 |1\rangle = E_1^{(0)} |1\rangle \langle 1|1\rangle + E_2^{(0)} |2\rangle \langle 2|1\rangle = E_1^{(0)} |1\rangle ,$$

and

$$\hat{H}_0 |2\rangle = E_1^{(0)} |1\rangle \langle 1|2\rangle + E_2^{(0)} |2\rangle \langle 2|2\rangle = E_2^{(0)} |2\rangle ,$$

confirming that the Hamiltonian given by Eq. 10.1 does, indeed, have the properties
ascribed to it. It is obvious that in the basis of these eigenvectors, $\hat{H}_0$ is presented
by a diagonal matrix with eigenvalues along the main diagonal. In the most general
form, the interaction term can be written down as

$$\hat{V} = V_{11} |1\rangle \langle 1| + V_{22} |2\rangle \langle 2| + V_{12} |1\rangle \langle 2| + V_{21} |2\rangle \langle 1| .$$

The diagonal elements in this expression, $V_{ii}(t) = \langle i| \hat{V} |i\rangle$, often vanish, thanks to
the symmetry of the system. Indeed, if the initial Hamiltonian is symmetric with
respect to inversion, its eigenvectors have definite parity: they are either odd or
even. If, in addition, the interaction Hamiltonian is odd (which is quite common;
for instance, the electric-dipole interaction is proportional to $\mathbf{r} \cdot \boldsymbol{\mathcal{E}}$, where $\boldsymbol{\mathcal{E}}$ is
the electric field, and the position operator changes sign upon inversion), the diagonal
elements of the interaction term must vanish (details of the arguments can be found
in Sect. 7.1). Also, the requirement that the operator must be Hermitian demands
that $V_{21} = V_{12}^*$.


10.1 Two-Level System with a Time-Independent 
Interaction: Avoided Level Crossing 


I begin by considering the properties of the two-level model with a time-independent
interaction term, so that the complete Hamiltonian of the system becomes

$$\hat{H} = E_1^{(0)} |1\rangle \langle 1| + E_2^{(0)} |2\rangle \langle 2| + V_{12} |1\rangle \langle 2| + V_{12}^* |2\rangle \langle 1| , \tag{10.2}$$

where $V_{ij}$ are in general complex constant parameters. Since this is a
time-independent Hamiltonian, it makes sense to explore its eigenvectors and eigenvalues
using vectors $|1\rangle$ and $|2\rangle$ as a basis. The Hamiltonian in this representation becomes
a 2 x 2 matrix so that the eigenvector equation can be written in the matrix form

$$
\begin{bmatrix} E_1^{(0)} & V_{12} \\ V_{12}^* & E_2^{(0)} \end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}
= E \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} \tag{10.3}
$$

and the corresponding equation for the eigenvalues becomes

$$
\begin{vmatrix} E_1^{(0)} - E & V_{12} \\ V_{12}^* & E_2^{(0)} - E \end{vmatrix} = 0 .
$$

Evaluation of the determinant turns it into a simple quadratic equation:

$$E^2 - E \left( E_1^{(0)} + E_2^{(0)} \right) + E_1^{(0)} E_2^{(0)} - |V_{12}|^2 = 0$$

with two solutions (I provided a lot of detailed derivations in this book, but I am not
going to show how to solve quadratic equations!)

$$E_1 = \frac{1}{2} \left( E_1^{(0)} + E_2^{(0)} \right) + \frac{1}{2} \sqrt{ \left( E_1^{(0)} - E_2^{(0)} \right)^2 + 4 |V_{12}|^2 } \tag{10.4}$$

$$E_2 = \frac{1}{2} \left( E_1^{(0)} + E_2^{(0)} \right) - \frac{1}{2} \sqrt{ \left( E_1^{(0)} - E_2^{(0)} \right)^2 + 4 |V_{12}|^2 } . \tag{10.5}$$

Substituting the first of these solutions into

$$\left( E_1^{(0)} - E \right) a_1 + V_{12} a_2 = 0$$

(the first of the equations encoded in the matrix form in Eq. 10.3), I find the ratio of
the coefficients representing the first eigenvector of the Hamiltonian:

$$\frac{a_1^{(1)}}{a_2^{(1)}} = -\frac{2 V_{12}}{E_1^{(0)} - E_2^{(0)} - \sqrt{ \left( E_1^{(0)} - E_2^{(0)} \right)^2 + 4 |V_{12}|^2 }} . \tag{10.6}$$

Repeating this calculation with the second eigenvalue, I find the ratio of the
coefficients for the second eigenvector:

$$\frac{a_1^{(2)}}{a_2^{(2)}} = -\frac{2 V_{12}}{E_1^{(0)} - E_2^{(0)} + \sqrt{ \left( E_1^{(0)} - E_2^{(0)} \right)^2 + 4 |V_{12}|^2 }} . \tag{10.7}$$

The normalization coefficients for these eigenvectors are too cumbersome and are
not too informative, so I will leave the eigenvectors non-normalized. Both of them
can be written as a superposition of vectors $|1\rangle$ and $|2\rangle$ with coefficients $a_{1,2}^{(1,2)}$ defined
by Eqs. 10.6 and 10.7:

$$|E_{1,2}\rangle = a_1^{(1,2)} |1\rangle + a_2^{(1,2)} |2\rangle , \tag{10.8}$$

where I used eigenvalues to label the corresponding eigenvectors.

The ratios of the coefficients in this superposition determine the relative contributions
of each of the original states to $|E_{1,2}\rangle$. These ratios depend on the relation
between the inter-level spectral distance $E_1^{(0)} - E_2^{(0)}$ and the interaction matrix
element $|V_{12}|$. If the former is much larger than the latter, I can expand the
radical in the denominators of Eqs. 10.6 and 10.7 as

$$\sqrt{ \left( E_1^{(0)} - E_2^{(0)} \right)^2 + 4 |V_{12}|^2 } \approx E_1^{(0)} - E_2^{(0)} + \frac{2 |V_{12}|^2}{E_1^{(0)} - E_2^{(0)}} ,$$

where it is assumed for concreteness that $E_1^{(0)} > E_2^{(0)}$. Then Eqs. 10.6 and 10.7 yield

$$
\frac{a_1^{(1)}}{a_2^{(1)}} \approx \frac{V_{12} \left( E_1^{(0)} - E_2^{(0)} \right)}{|V_{12}|^2} \gg 1 ; \qquad
\frac{a_1^{(2)}}{a_2^{(2)}} \approx -\frac{V_{12}}{E_1^{(0)} - E_2^{(0)}} \ll 1 ,
$$

where the inequalities refer to the magnitudes of the ratios.
Thus, the contributions of the state presented by vector $|2\rangle$ to the eigenvector
$|E_1\rangle$ and of state $|1\rangle$ to the eigenvector $|E_2\rangle$ are very small. Not surprisingly, the
energy $E_1$ in this limit is close to $E_1^{(0)}$, and $E_2$ is close to $E_2^{(0)}$ (check it out, please).
These results justify the assumption lying at the foundation of the two-level model:
contributions from energetically remote states can, indeed, be neglected. They also
provide a quantitative condition for the validity of this approximation: $|E_n^{(0)} - E_m^{(0)}| \gg |V_{nm}|$,
where $n$, $m$ are the labels for energy levels and the corresponding states.
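For those who took the "check it out, please" seriously, here is a sketch of the check. I assume $E_1^{(0)} > E_2^{(0)}$ and $|V_{12}| \ll E_1^{(0)} - E_2^{(0)}$, and expand the square root in the exact eigenvalues, $\tfrac12(E_1^{(0)} + E_2^{(0)}) \pm \tfrac12\sqrt{(E_1^{(0)} - E_2^{(0)})^2 + 4|V_{12}|^2}$, to the first non-vanishing order in $|V_{12}|$:

```latex
\begin{aligned}
E_{1,2} &\approx \frac{1}{2}\left(E_1^{(0)} + E_2^{(0)}\right)
 \pm \frac{1}{2}\left[E_1^{(0)} - E_2^{(0)} + \frac{2|V_{12}|^2}{E_1^{(0)} - E_2^{(0)}}\right], \\
E_1 &\approx E_1^{(0)} + \frac{|V_{12}|^2}{E_1^{(0)} - E_2^{(0)}}, \qquad
E_2 \approx E_2^{(0)} - \frac{|V_{12}|^2}{E_1^{(0)} - E_2^{(0)}} .
\end{aligned}
```

The corrections are quadratic in the small parameter $|V_{12}|/(E_1^{(0)} - E_2^{(0)})$ and have opposite signs: the interaction pushes the two levels apart, however weak it is.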

It is easy to verify that if I reversed the inequality $E_1^{(0)} > E_2^{(0)}$ and assumed instead
that $E_1^{(0)} < E_2^{(0)}$, the roles of vectors $|1\rangle$ and $|2\rangle$ would be interchanged: the main
contribution to state $|E_1\rangle$ would come from the initial vector $|2\rangle$, and state $|E_2\rangle$
would be mostly determined by $|1\rangle$. This flipping between the initial vectors
is due to a trivial but often overlooked property of the square root, $\sqrt{x^2} = |x|$, which
is $x$ when $x$ is positive and $-x$ when it is negative. In one of the exercises, you are
asked to verify this flipping phenomenon.


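In the opposite limit, $\left| E_1^{(0)} - E_2^{(0)} \right| \ll |V_{12}|$, the radical in Eqs. 10.6 and 10.7 can
be approximated as

$$\sqrt{ \left( E_1^{(0)} - E_2^{(0)} \right)^2 + 4 |V_{12}|^2 } \approx 2 |V_{12}| , \tag{10.9}$$

which is valid with accuracy up to terms of the order of $\left( E_1^{(0)} - E_2^{(0)} \right)^2 / |V_{12}|^2$.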
The ratios of the coefficients in this case become 
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„ V 12 

a 2^ 

E ( 0 ) _ E (o) _ 2 \v n \ 


„ V l2 


E ( f -Ef +2\V n \ 



-e iSv 


1 + 


1 - 


E ( f _ E <fl) \ 

2 I'M ) 
where I introduced the phase of the matrix element, $V_{12} = |V_{12}| \exp(i\delta_V)$, and used
the approximation $(1 + x)^{-1} \approx 1 - x$. Note that the correction to the main terms
($\pm \exp[i\delta_V]$) in both expressions is linear in $\left(E_1^{(0)} - E_2^{(0)}\right) / |V_{12}|$, which justifies
the approximation for the radical used in Eq. 10.9 (neglected quadratic terms are smaller
than the linear ones kept in the expressions for the coefficients). The contributions
of the initial eigenvectors in this limit are almost equal to each other in magnitude
while differing in their phase by $\pi$ (do I need to remind you that $-1 = \exp(i\pi)$?).
Approximate expressions for the energy eigenvalues, Eqs. 10.6 and 10.7, in this limit
become (again neglecting quadratic terms in $\left(E_1^{(0)} - E_2^{(0)}\right) / |V_{12}|$)


\[
E_1 = \frac{1}{2}\left(E_1^{(0)} + E_2^{(0)}\right) + |V_{12}| \tag{10.10}
\]

\[
E_2 = \frac{1}{2}\left(E_1^{(0)} + E_2^{(0)}\right) - |V_{12}|. \tag{10.11}
\]


What is significant about this result is that even when the difference between initial
energy levels is very small compared to the matrix element of the interaction, the
difference between the actual eigenvalues is $2|V_{12}|$ and is not small at all.
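
A quick numerical check of the strong-coupling limit may help here. The sketch below is only an illustration under stated assumptions: the Hamiltonian matrix is taken as [[E_1^{(0)}, V_12], [V_12*, E_2^{(0)}]], the ratio a_1/a_2 is read off the first row of the eigenvalue problem, and the numbers (a gap 200 times smaller than $|V_{12}|$, a phase of 0.7 rad) are arbitrary choices.

```python
import cmath
import math

# Illustrative numbers (arbitrary energy units): the gap is 200 times smaller than |V12|.
e1, e2 = 0.505, 0.495
v12 = 1.0 * cmath.exp(0.7j)
gap = e1 - e2

root = math.sqrt(gap ** 2 + 4 * abs(v12) ** 2)
upper = 0.5 * (e1 + e2) + 0.5 * root
lower = 0.5 * (e1 + e2) - 0.5 * root

# First row of the eigenvalue problem: (e1 - E)*a1 + v12*a2 = 0  =>  a1/a2 = v12 / (E - e1)
ratio_upper = v12 / (upper - e1)
ratio_lower = v12 / (lower - e1)

# Linear-order estimates: +/- exp(i*delta_V) * (1 +/- gap / (2|V12|))
phase = cmath.exp(1j * cmath.phase(v12))
approx_upper = phase * (1 + gap / (2 * abs(v12)))
approx_lower = -phase * (1 - gap / (2 * abs(v12)))

print("level splitting:", upper - lower, " compared with 2|V12| =", 2 * abs(v12))
print("upper state a1/a2:", ratio_upper, " estimate:", approx_upper)
print("lower state a1/a2:", ratio_lower, " estimate:", approx_lower)
```

The two unperturbed states enter both eigenvectors with nearly equal weights, and the splitting stays close to $2|V_{12}|$ no matter how small the unperturbed gap becomes.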

Experimentalists love two-level models because they are simple (all you
need to know is how to solve quadratic equations), and they are tempted to use them
as often as they can in disparate fields of physics. Theoreticians, of course, hate this
model with as much fervor because if all of physics could be explained
by a two-level model, all theoreticians would lose their jobs. Luckily, this is
not the case.

The physics described by this model becomes particularly interesting (and
important) if the initial Hamiltonian $H_0$ depends on some parameter, which can
be controlled experimentally in such a way that the sign of the difference $E_1^{(0)} - E_2^{(0)}$
can be continuously altered. In this case, at a certain value of this parameter, the two
initial energy levels become degenerate, and if one plots $E_1^{(0)}$ and
$E_2^{(0)}$ as functions of this parameter, the corresponding curves cross at some
point. This is an example of an accidental degeneracy, which is not related to any
symmetry and occurs only at particular values of a system’s parameters. Still, it 
happens in a number of physical systems and is of great interest because it affects 
how the system reacts to various stimuli. If, however, one plots the dependence of 
the actual eigenvalues as functions of the same parameter, the curves would not
cross each other, as is obvious from Eqs. 10.10 and 10.11. The curves representing
this dependence will now look like the ones shown in Fig. 10.1. You can see that
the curves do not cross each other anymore, giving this phenomenon the name of
avoided level crossing.

Fig. 10.1 An example of avoided crossing
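
To see the avoided crossing in numbers, one can sweep a control parameter and compare the unperturbed gap with the actual one. The following sketch is only an illustration: the linear dependence of the unperturbed levels on the parameter `u` (E_1^{(0)} = -u, E_2^{(0)} = u) and the coupling strength `V_ABS` are hypothetical choices, not taken from the text.

```python
import math


def levels(e1, e2, v_abs):
    """Exact eigenvalues (upper, lower) of a two-level Hamiltonian with |V12| = v_abs."""
    root = math.sqrt((e1 - e2) ** 2 + 4 * v_abs ** 2)
    mean = 0.5 * (e1 + e2)
    return mean + 0.5 * root, mean - 0.5 * root


V_ABS = 0.2  # assumed coupling strength, arbitrary energy units
u_values = [0.5 * i - 2.0 for i in range(9)]  # control parameter from -2 to 2

rows = []
for u in u_values:
    e1, e2 = -u, u  # hypothetical linear dependence; the unperturbed levels cross at u = 0
    upper, lower = levels(e1, e2, V_ABS)
    rows.append((u, abs(e1 - e2), upper - lower))
    print(f"u = {u:+.1f}  unperturbed gap = {abs(e1 - e2):.3f}  actual gap = {upper - lower:.3f}")

min_gap = min(actual for _, _, actual in rows)
print("smallest actual gap:", min_gap, " compared with 2|V12| =", 2 * V_ABS)
```

The unperturbed gap goes through zero at u = 0, while the actual gap never drops below $2|V_{12}|$.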

This is a remarkable phenomenon, which is not easily appreciated. Let me try to
help you understand what is so special about these two curves not crossing each
other. Let’s begin far to the left of the point of degeneracy, where $E_1^{(0)} > E_2^{(0)}$.
We ascertained that in this case the lower curve describes the energy of a state
which is mostly $|2\rangle$, while the state whose energy belongs to the upper curve is
mostly $|1\rangle$. At the point of avoided crossing, the eigenvectors describing the state of
the system consist of both $|1\rangle$ and $|2\rangle$ in equal proportions. Now let’s keep moving
along the lower curve, which means that we are turning the dial and experimentally
gradually changing our control parameter. After we have passed the point of
avoided crossing, the relation between the initial energy levels has changed: now we
have $E_1^{(0)} < E_2^{(0)}$. Now, the main contribution to the superposition represented by
the points on the lower curve comes from the state $|1\rangle$,¹ and if I move the system
far enough from the avoided crossing point, I will have a state mostly consisting
of the state $|1\rangle$. Now think about it: we started with the state of the system being
predominantly $|2\rangle$, and by continuously changing our parameter, we transformed
this state into one which is now predominantly state $|1\rangle$. This is better than any
Hogwarts-style transformation wizardry simply because it is not magic and not an
illusion, just honest-to-goodness quantum mechanics!
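
The gradual change of character along the lower curve can be followed numerically. The sketch below makes the same kind of illustrative assumptions as any toy model would: a hypothetical linear parametrization E_1^{(0)} = -u, E_2^{(0)} = u (so that E_1^{(0)} > E_2^{(0)} to the left of the crossing), a coupling strength `V_ABS` chosen arbitrarily, and an unnormalized eigenvector read off the first row of the eigenvalue problem.

```python
import math


def lower_state_weight_of_1(e1, e2, v_abs):
    """Weight |a1|^2 of |1> in the normalized lower-energy eigenvector."""
    root = math.sqrt((e1 - e2) ** 2 + 4 * v_abs ** 2)
    lower = 0.5 * (e1 + e2) - 0.5 * root
    # Unnormalized eigenvector from (e1 - E)*a1 + V12*a2 = 0:  a1 = V12, a2 = E - e1.
    a1_sq = v_abs ** 2
    a2_sq = (lower - e1) ** 2
    return a1_sq / (a1_sq + a2_sq)


V_ABS = 0.2  # assumed coupling strength, arbitrary energy units
weights = {}
for u in (-2.0, -1.0, 0.0, 1.0, 2.0):
    e1, e2 = -u, u  # hypothetical parametrization: E1 > E2 on the left, E1 < E2 on the right
    weights[u] = lower_state_weight_of_1(e1, e2, V_ABS)
    print(f"u = {u:+.1f}  weight of |1> on the lower curve = {weights[u]:.4f}")
```

Far to the left the weight of $|1\rangle$ on the lower curve is close to zero, at the avoided crossing it equals one half, and far to the right it approaches unity.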


¹ Recall a comment I made at the end of the discussion of the limit $|E_1^{(0)} - E_2^{(0)}| \gg |V_{12}|$.
10.2 Two-Level System in a Harmonic Electric Field:
Rabi Oscillations


Now let me switch gears and allow the perturbation operator $V$ to become a function
of time. More specifically, I will assume that perturbation matrix elements $V_{12}$ and
$V_{21}$ have the following form:

\[
V_{21}(t) = V_{12}(t) = \mathcal{E} \cos \Omega t,
\]

where $\mathcal{E}$ is real. This form of the perturbation describes, for instance, a dipole
interaction between a two-level system and a harmonic electric field and appears
in many realistic situations. The Hamiltonian of the system in this case reads

\[
H = E_1^{(0)} |1\rangle\langle 1| + E_2^{(0)} |2\rangle\langle 2| + \mathcal{E} \cos \Omega t \left( |1\rangle\langle 2| + |2\rangle\langle 1| \right). \tag{10.12}
\]

This is the first time you are dealing with an explicitly time-dependent Hamiltonian
in the Schrödinger picture, and this requires certain adjustments in the way of
thinking about the problem. First of all, you have to accept the fact that you cannot
present solutions in the form $\exp(-iEt/\hbar)\,|\psi\rangle$ with $|\psi\rangle$ being an eigenvector of
the Hamiltonian. The equation $H|\psi\rangle = E|\psi\rangle$ with a time-dependent Hamiltonian and
time-independent $|\psi\rangle$ does not make sense anymore. In other words, the stationary
states do not exist in the case of time-dependent Hamiltonians, and we need,
therefore, a new way of solving the time-dependent Schrödinger equation. No one
can forbid you, however, to use eigenvectors of any time-independent Hamiltonian
as a basis, because a basis is a basis regardless of the properties of the Hamiltonian.
The choice of the basis is determined solely by convenience, and it is
especially convenient in this case to use eigenvectors of $H_0$ represented by vectors $|1\rangle$
and $|2\rangle$. Thus, let me present the unknown time-dependent state vector $|\psi(t)\rangle$ as a
linear combination of these vectors:

\[
|\psi(t)\rangle = a_1(t) \exp\left(-\frac{iE_1^{(0)} t}{\hbar}\right) |1\rangle + a_2(t) \exp\left(-\frac{iE_2^{(0)} t}{\hbar}\right) |2\rangle \tag{10.13}
\]


with some unknown coefficients $a_{1,2}(t)$. This expression is very reminiscent of Eq. 4.15
for a general solution with a time-independent Hamiltonian but with two significant 
differences—first, the basis used in Eq. 10.13 is not formed by eigenvectors of the 
total Hamiltonian H, which does not have eigenvectors (at least not in a regular sense 
of the word), and second, the expansion coefficients now are unknown functions of 
time, while their counterparts in Eq. 4.15 were constants. You might wonder at this 
point if I am allowed to separate the exponential factors characteristic of the time 
dependence of the stationary states. A simple answer is: “Why not?” As long as I 
allow for the yet undetermined time dependence of the residual coefficients, I can 
factor out any time-dependent function I want. It will affect the equations, which 
these coefficients obey, but not the final result. The most meticulous of you might 
also ask that even if it is allowed to pull out these factors, why bother doing it? This
is a more valid question, which deserves a more detailed answer. Let me begin by 
saying that I did not have to do it: the earth would not stop in its tracks if I did not, 
and we would still solve the problem. However, by doing so, I reflect a somewhat 
deeper understanding of two distinct sources of the time dependence of the vector 
states. One is a trivial dependence given by these exponential factors, which would 
have existed even if the Hamiltonian did not depend on time. These exponential 
factors have nothing to do with the time dependence of the Hamiltonian. Factoring 
them out right away, I ensure that the remaining time dependence of the coefficients 
reflects only genuine nontrivial dynamics. As an extra bonus, I hope that by doing 
so, I will arrive at equations that are easier to analyze. 

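Substitution of Eq. 10.13 into the left-hand side of the Schrödinger equation,
$i\hbar\, d|\psi\rangle/dt$, yields

\[
\begin{aligned}
i\hbar \frac{d|\psi\rangle}{dt} ={}& E_1^{(0)} a_1(t) \exp\left(-\frac{iE_1^{(0)} t}{\hbar}\right) |1\rangle
 + i\hbar \frac{da_1(t)}{dt} \exp\left(-\frac{iE_1^{(0)} t}{\hbar}\right) |1\rangle \\
&+ E_2^{(0)} a_2(t) \exp\left(-\frac{iE_2^{(0)} t}{\hbar}\right) |2\rangle
 + i\hbar \frac{da_2(t)}{dt} \exp\left(-\frac{iE_2^{(0)} t}{\hbar}\right) |2\rangle.
\end{aligned} \tag{10.14}
\]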


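The right-hand side of this equation, $H|\psi\rangle$, with $H$ defined by Eq. 10.12 and $|\psi\rangle$ by
Eq. 10.13, becomes

\[
\begin{aligned}
H|\psi\rangle ={}& E_1^{(0)} a_1(t) \exp\left(-\frac{iE_1^{(0)} t}{\hbar}\right) |1\rangle
 + E_2^{(0)} a_2(t) \exp\left(-\frac{iE_2^{(0)} t}{\hbar}\right) |2\rangle \\
&+ \mathcal{E} a_2(t) \cos \Omega t \, \exp\left(-\frac{iE_2^{(0)} t}{\hbar}\right) |1\rangle
 + \mathcal{E} a_1(t) \cos \Omega t \, \exp\left(-\frac{iE_1^{(0)} t}{\hbar}\right) |2\rangle,
\end{aligned} \tag{10.15}
\]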


where I took into account the orthogonality of the basis states. Equating coefficients
in front of vectors $|1\rangle$ and $|2\rangle$ on the left- and right-hand sides of the Schrödinger
equation (Eqs. 10.14 and 10.15, respectively) results in differential equations for
the time-dependent coefficients $a_{1,2}(t)$:

\[
i\hbar \frac{da_1(t)}{dt} = \mathcal{E} a_2(t) \cos \Omega t \, \exp\left(\frac{i\left[E_1^{(0)} - E_2^{(0)}\right] t}{\hbar}\right) \tag{10.16}
\]

\[
i\hbar \frac{da_2(t)}{dt} = \mathcal{E} a_1(t) \cos \Omega t \, \exp\left(-\frac{i\left[E_1^{(0)} - E_2^{(0)}\right] t}{\hbar}\right) \tag{10.17}
\]
Factors $\exp\left(\pm i\left[E_1^{(0)} - E_2^{(0)}\right] t/\hbar\right)$ on the right-hand side in these equations
appeared as a result of eliminating the corresponding exponential factors
$\exp\left(-iE_{1,2}^{(0)} t/\hbar\right)$ from their left-hand sides. Note that energy eigenvalues appear
in these equations only in the form of their difference, which is just another
manifestation of the already mentioned fact that the absolute values of the energy
levels are irrelevant. To simplify the notation, let me introduce a so-called transition
frequency:

\[
\omega_{12} = \frac{E_1^{(0)} - E_2^{(0)}}{\hbar}, \tag{10.18}
\]

where I again for concreteness assumed that $E_1^{(0)} - E_2^{(0)} > 0$. Introducing this
notation and replacing $\cos \Omega t$ by the sum of the respective exponential functions,
I can rewrite Eqs. 10.16 and 10.17 in the following form:

\[
i\hbar \frac{da_1(t)}{dt} = \frac{1}{2} \mathcal{E} a_2(t) \left( \exp[i(\omega_{12} - \Omega)t] + \exp[i(\omega_{12} + \Omega)t] \right) \tag{10.19}
\]

\[
i\hbar \frac{da_2(t)}{dt} = \frac{1}{2} \mathcal{E} a_1(t) \left( \exp[-i(\omega_{12} - \Omega)t] + \exp[-i(\omega_{12} + \Omega)t] \right) \tag{10.20}
\]

Equations 10.19 and 10.20 cannot be solved analytically. However, the most
interesting phenomena described by these equations occur when $|\omega_{12} - \Omega| \ll
\omega_{12} + \Omega$, in which case I can introduce an effective approximation capturing the
most important properties of the model (obviously, something will be left out, and
there might be situations when this something becomes important, but I am going to
pretend that such situations do not concern me at all). In order to formulate this
approximation, it is convenient to introduce a parameter $\Delta = \omega_{12} - \Omega$ called
frequency detuning. In the case of small detuning, the two exponential terms in
Eqs. 10.19 and 10.20 change with time on significantly different time scales. Terms
containing $\omega_{12} - \Omega$ oscillate with a much larger period (much slower) as compared
to the terms containing $\omega_{12} + \Omega$, which exhibit comparatively fast oscillations.

In order to understand why fast oscillations are not effective in influencing the 
behavior of the system, imagine a regular pendulum acted upon by a force, which 
changes its direction faster than the pendulum manages to react to it (it is called 
inertia, in case you forgot, and it takes some time for any quantity to change by 
any appreciable amount). What will happen to the pendulum in this case? Right 
before it has any chance to move in the initial direction of the force, the force will 
have already changed and push the pendulum in the opposite direction. This is a 
very frustrating situation, so the pendulum will just stay where it is. This effect
is called self-averaging in scientific jargon: the force changes so much faster
than the reaction time of the pendulum that it effectively averages itself out to zero.
Taking advantage of this self-averaging effect, I will drop the fast-changing terms 
in Eqs. 10.19 and 10.20, turning them into 
\[
i\hbar \frac{da_1(t)}{dt} = \frac{1}{2} \mathcal{E} a_2(t) \exp(i\Delta t) \tag{10.21}
\]

\[
i\hbar \frac{da_2(t)}{dt} = \frac{1}{2} \mathcal{E} a_1(t) \exp(-i\Delta t) \tag{10.22}
\]

Differentiating the first of these equations with respect to time, I get

\[
i\hbar \frac{d^2 a_1(t)}{dt^2} = \frac{1}{2} \mathcal{E} \frac{da_2(t)}{dt} \exp(i\Delta t) + \frac{1}{2} i\Delta \mathcal{E} a_2(t) \exp(i\Delta t).
\]

Now, taking $da_2/dt$ from Eq. 10.22 while expressing $a_2(t)$ in terms of $da_1/dt$ using
Eq. 10.21, I get rid of the coefficient $a_2$ and derive an equation containing
only $a_1$:

\[
\frac{d^2 a_1(t)}{dt^2} - i\Delta \frac{da_1(t)}{dt} + \frac{\mathcal{E}^2}{4\hbar^2} a_1(t) = 0. \tag{10.23}
\]


Did you notice how the time-dependent exponents in Eq. 10.23 magically disappeared,
turning it into a regular linear differential equation of the second order
with constant coefficients? You might notice that this is the same equation which
describes (among other things) the motion of a damped harmonic oscillator with
damping represented by a term with the first time derivative. This might appear a bit
troublesome, because the motion of a damped harmonic oscillator is characterized
by exponential decay of the respective quantities with time, and this is not the
behavior which we would like our quantum state to have. However, before going
into panic mode, look at the equation a bit more carefully, and then you might
notice that “the damping” coefficient (whatever appears in front of $da_1/dt$) is purely
imaginary, so no real damping takes place, and you can breathe easier.

Damping or no damping, I know that equations of the type of Eq. 10.23 are solved
by an exponential function, which I choose in the form $\exp(i\omega t)$. Substitution of
this function into Eq. 10.23 yields an equation for the yet unknown parameter $\omega$:

\[
\omega^2 - \Delta \omega - \frac{1}{4} \Omega_R^2 = 0,
\]

where I introduced a new quantity of the dimension of frequency,

\[
\Omega_R = \frac{\mathcal{E}}{\hbar}, \tag{10.24}
\]

which plays an important role in the phenomena we are about to uncover. The
quadratic equation for $\omega$ has two solutions:

\[
\omega_\pm = \frac{1}{2} \Delta \pm \frac{1}{2} \sqrt{\Delta^2 + \Omega_R^2} \tag{10.25}
\]
(both of which are, by the way, real) so that the general solution to Eq. 10.23 takes
the form

\[
a_1 = A \exp(i\omega_+ t) + B \exp(i\omega_- t). \tag{10.26}
\]


The expression for the second coefficient, $a_2$, is found using Eq. 10.21:

\[
a_2 = \frac{2i\hbar}{\mathcal{E}} \exp(-i\Delta t) \frac{da_1(t)}{dt}
    = -\frac{2}{\Omega_R} \exp(-i\Delta t) \left[ A\omega_+ \exp(i\omega_+ t) + B\omega_- \exp(i\omega_- t) \right].
\]


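Combining exponential functions in this equation, you might notice the emergence
of two frequencies, $\omega_+ - \Delta$ and $\omega_- - \Delta$, which evaluate to

\[
\omega_+ - \Delta = -\frac{1}{2}\Delta + \frac{1}{2}\sqrt{\Delta^2 + \Omega_R^2} = -\omega_-
\]

\[
\omega_- - \Delta = -\frac{1}{2}\Delta - \frac{1}{2}\sqrt{\Delta^2 + \Omega_R^2} = -\omega_+,
\]

allowing you to write an expression for $a_2$ as

\[
a_2 = -\frac{2}{\Omega_R} \left[ A\omega_+ \exp(-i\omega_- t) + B\omega_- \exp(-i\omega_+ t) \right]. \tag{10.27}
\]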
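
The algebra behind Eq. 10.25 and the two frequency identities is easy to confirm with a few lines of Python. This is only a sanity check, with the detuning and the Rabi frequency set to arbitrary illustrative values; the function name `rabi_roots` is an assumption of the sketch.

```python
import math


def rabi_roots(delta, omega_r):
    """Roots of w**2 - delta*w - omega_r**2/4 = 0, returned as (w_plus, w_minus)."""
    root = math.sqrt(delta ** 2 + omega_r ** 2)
    return 0.5 * delta + 0.5 * root, 0.5 * delta - 0.5 * root


DELTA, OMEGA_R = 0.7, 1.3  # illustrative detuning and Rabi frequency, arbitrary units
w_plus, w_minus = rabi_roots(DELTA, OMEGA_R)

residuals = [w ** 2 - DELTA * w - 0.25 * OMEGA_R ** 2 for w in (w_plus, w_minus)]
print("residuals of the quadratic:", residuals)
print("w_plus - delta  =", w_plus - DELTA, "  -w_minus =", -w_minus)
print("w_minus - delta =", w_minus - DELTA, "  -w_plus  =", -w_plus)
```

Both roots are real for any real detuning, and their product equals $-\Omega_R^2/4$, so they always have opposite signs.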

Amplitudes $A$ and $B$ in Eqs. 10.26 and 10.27 are yet undetermined; to find them
I have to specify initial conditions for Eqs. 10.21 and 10.22, an issue which I
have not even mentioned yet. At the same time, you are perfectly aware that any
problem involving time evolution is not complete without initial conditions, which
in quantum mechanics mean a state of the system at some instant of time defined
as $t = 0$.

It is usually assumed in this type of problems that one can “turn on” the time- 
dependent interaction at some instant determined by the will of the experimentalist, 
and in many cases it does make sense. For instance, the time-dependent term in 
Hamiltonian 10.12 can represent a laser beam, which you can, indeed, turn on and 
off at will. In this case one can prepare the system to be in a specific state before the 
laser is turned on and study how this state will evolve due to the interaction with the 
laser radiation. It is simplest to prepare the system in the lowest energy stationary 
state, and so this is what I will choose as the initial condition: 


\[
|\psi(0)\rangle = |2\rangle.
\]

Taking into account Eq. 10.13, I can translate it into the following initial conditions
for the dynamic variables $a_1$ and $a_2$:

\[
a_1(0) = 0 \tag{10.28}
\]

\[
a_2(0) = 1. \tag{10.29}
\]

Substituting $t = 0$ into Eqs. 10.26 and 10.27 and using Eqs. 10.28 and 10.29, I derive
the following equations for amplitudes $A$ and $B$:

\[
A + B = 0; \qquad -\frac{2}{\Omega_R} \left[ A\omega_+ + B\omega_- \right] = 1,
\]

which are easily solved to yield

\[
A = -B = -\frac{\Omega_R}{2(\omega_+ - \omega_-)}.
\]

It is easy to see using Eq. 10.25 that

\[
\omega_+ - \omega_- = \sqrt{\Delta^2 + \Omega_R^2},
\]

so that the amplitudes take on the value

\[
A = -B = -\frac{\Omega_R}{2\sqrt{\Delta^2 + \Omega_R^2}}.
\]

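Having found $A$ and $B$, I can write down the final solutions for the time-dependent
coefficients:

\[
a_1 = \frac{\Omega_R}{2\sqrt{\Delta^2 + \Omega_R^2}} \left[ \exp(i\omega_- t) - \exp(i\omega_+ t) \right] \tag{10.30}
\]

\[
a_2 = -\frac{1}{\sqrt{\Delta^2 + \Omega_R^2}} \left[ \omega_- \exp(-i\omega_+ t) - \omega_+ \exp(-i\omega_- t) \right]. \tag{10.31}
\]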
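
Before interpreting these coefficients, it is reassuring to check that they respect the initial conditions and keep the state normalized at all times. The sketch below simply evaluates Eqs. 10.30 and 10.31 for arbitrary illustrative values of the detuning and the Rabi frequency; the function name `amplitudes` and the chosen times are assumptions of the example.

```python
import cmath
import math


def amplitudes(t, delta, omega_r):
    """Coefficients a1(t), a2(t) for the initial condition a1(0) = 0, a2(0) = 1."""
    root = math.sqrt(delta ** 2 + omega_r ** 2)
    w_plus = 0.5 * delta + 0.5 * root
    w_minus = 0.5 * delta - 0.5 * root
    a1 = omega_r / (2 * root) * (cmath.exp(1j * w_minus * t) - cmath.exp(1j * w_plus * t))
    a2 = -(w_minus * cmath.exp(-1j * w_plus * t) - w_plus * cmath.exp(-1j * w_minus * t)) / root
    return a1, a2


DELTA, OMEGA_R = 0.5, 1.0  # illustrative values in arbitrary frequency units
norms = []
for t in (0.0, 0.7, 2.0, 5.0, 11.3):
    a1, a2 = amplitudes(t, DELTA, OMEGA_R)
    norms.append(abs(a1) ** 2 + abs(a2) ** 2)
    print(f"t = {t:5.1f}  |a1|^2 = {abs(a1) ** 2:.6f}  |a2|^2 = {abs(a2) ** 2:.6f}")

a1_start, a2_start = amplitudes(0.0, DELTA, OMEGA_R)
```

The sum $|a_1|^2 + |a_2|^2$ stays equal to one, as it must for a normalized state.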

These equations formally solve the problem I set out for you to solve: you
now know the time-dependent state of the two-level system described by
Hamiltonian 10.12 at any instant of time. But I wouldn’t blame you if you still have this
annoying gnawing feeling of not being quite satisfied, probably because you are not
quite sure what to do with this solution and what kind of useful physical information
you can dig out from it. Indeed, the standard interpretation of coefficients in
expressions similar to Eq. 10.13 as probability amplitudes, whose squared absolute values
yield the probability of obtaining a corresponding value of an observable whose
eigenvectors are used as a basis, wouldn’t work here. The problem is that we are
using the basis provided by eigenvectors of the Hamiltonian of a system, which
does not exist anymore, so that this traditional interpretation does not make much
sense.

One way to make sense out of Eqs. 10.26 and 10.27 is to recognize that in a 
typical experiment, the time-dependent interaction does not last forever—it starts at 
some instant, which you can designate as t = 0, and it usually ends at some time 
t = tf (for instance, when a graduate student running the experiment gets tired, 
turns the laser off, and goes on a date). So, after the time-dependent part of the 
Hamiltonian vanishes, you are back to the standard situation, but the system is now 
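in a superposition state defined by the values of the coefficients $a_{1,2}$ at the time
when the laser got switched off. Now, you can quickly take the measurement of the
energy and interpret the results in terms of probabilities of getting one of two values:
$E_1^{(0)}$ or $E_2^{(0)}$. The probability $p\left(E_1^{(0)}\right)$ that the measurement would yield $E_1^{(0)}$ is given
as usual by $|a_1|^2$, which according to Eq. 10.30 is

\[
\begin{aligned}
p\left(E_1^{(0)}\right) &= \frac{\Omega_R^2}{4\left(\Delta^2 + \Omega_R^2\right)} \left( 2 - \exp[i(\omega_+ - \omega_-)t_f] - \exp[-i(\omega_+ - \omega_-)t_f] \right) \\
&= \frac{\Omega_R^2}{2\left(\Delta^2 + \Omega_R^2\right)} \left[ 1 - \cos(\omega_+ - \omega_-)t_f \right]
 = \frac{\Omega_R^2}{\Delta^2 + \Omega_R^2} \sin^2 \frac{(\omega_+ - \omega_-)\,t_f}{2} \\
&= \frac{\Omega_R^2}{\Delta^2 + \Omega_R^2} \sin^2 \left( \frac{\sqrt{\Delta^2 + \Omega_R^2}}{2}\, t_f \right).
\end{aligned} \tag{10.32}
\]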


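The probability that this measurement would yield the value $E_2^{(0)}$ could have been
computed in exactly the same manner, and I will give you a chance to do it
as an exercise, but here I will be smart and take advantage of the fact that
$p\left(E_1^{(0)}\right) + p\left(E_2^{(0)}\right) = 1$, so that without much ado, I can present you with

\[
p\left(E_2^{(0)}\right) = 1 - \frac{\Omega_R^2}{\Delta^2 + \Omega_R^2} \sin^2 \left( \frac{\sqrt{\Delta^2 + \Omega_R^2}}{2}\, t_f \right)
= \frac{\Delta^2}{\Delta^2 + \Omega_R^2} + \frac{\Omega_R^2}{\Delta^2 + \Omega_R^2} \cos^2 \left( \frac{\sqrt{\Delta^2 + \Omega_R^2}}{2}\, t_f \right). \tag{10.33}
\]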
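
Both the closed-form result and the approximation that led to it can be tested by brute force. The sketch below integrates the coefficient equations with a hand-written fourth-order Runge-Kutta scheme in units where $\hbar = 1$ (so that $\mathcal{E} = \Omega_R$), once with only the slow terms kept (Eqs. 10.21 and 10.22) and once with the fast terms included (Eqs. 10.19 and 10.20). The integrator, the function names, the step count, and the parameter values (transition frequency 50, driving frequency 49.5, Rabi frequency 1) are all assumptions chosen for illustration.

```python
import cmath
import math


def derivatives(t, a1, a2, omega_12, omega, omega_r, keep_fast_terms):
    """Right-hand sides of the coefficient equations in units with hbar = 1."""
    slow = cmath.exp(1j * (omega_12 - omega) * t)
    fast = cmath.exp(1j * (omega_12 + omega) * t) if keep_fast_terms else 0.0
    drive = slow + fast
    da1 = -0.5j * omega_r * a2 * drive
    da2 = -0.5j * omega_r * a1 * drive.conjugate()
    return da1, da2


def evolve(t_final, omega_12, omega, omega_r, keep_fast_terms, n_steps=8000):
    """Fourth-order Runge-Kutta integration starting from a1 = 0, a2 = 1."""
    dt = t_final / n_steps
    t, a1, a2 = 0.0, 0j, 1 + 0j
    args = (omega_12, omega, omega_r, keep_fast_terms)
    for _ in range(n_steps):
        k1 = derivatives(t, a1, a2, *args)
        k2 = derivatives(t + dt / 2, a1 + dt / 2 * k1[0], a2 + dt / 2 * k1[1], *args)
        k3 = derivatives(t + dt / 2, a1 + dt / 2 * k2[0], a2 + dt / 2 * k2[1], *args)
        k4 = derivatives(t + dt, a1 + dt * k3[0], a2 + dt * k3[1], *args)
        a1 += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        a2 += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += dt
    return a1, a2


def p_upper_formula(t, delta, omega_r):
    """Closed-form probability of the upper level when only slow terms are kept."""
    root = math.sqrt(delta ** 2 + omega_r ** 2)
    return omega_r ** 2 / root ** 2 * math.sin(0.5 * root * t) ** 2


# Illustrative parameters: small detuning, Rabi frequency much smaller than omega_12.
OMEGA_12, OMEGA, OMEGA_R, T_FINAL = 50.0, 49.5, 1.0, 3.0
a1_slow, _ = evolve(T_FINAL, OMEGA_12, OMEGA, OMEGA_R, keep_fast_terms=False)
a1_full, a2_full = evolve(T_FINAL, OMEGA_12, OMEGA, OMEGA_R, keep_fast_terms=True)
p_formula = p_upper_formula(T_FINAL, OMEGA_12 - OMEGA, OMEGA_R)

print("closed form:            ", p_formula)
print("slow terms only, RK4:   ", abs(a1_slow) ** 2)
print("fast terms included, RK4:", abs(a1_full) ** 2)
```

The slow-term integration reproduces the closed-form probability to numerical accuracy, while keeping the fast terms changes the result only slightly for these parameters, which is the self-averaging argument at work.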


Equations 10.32 and 10.33 create a clear physical picture of what is happening with
our system. The first thing to note is the periodic oscillations of the probabilities with
time with frequency $\Omega_{GR} = \sqrt{\Delta^2 + \Omega_R^2}$, called the generalized Rabi frequency (note
that the factor 1/2 in the arguments of the cos and sin functions in these equations
is the result of the transition from $\cos x$ to the functions of $x/2$ and is, therefore,
not included in the definition of the frequency). There exist special times
$t_{f,n} = 2\pi n/\Omega_{GR}$, where $n$ is an integer, when the probability that the system will be found
in the higher energy state is zero, and there are times $t_{f,n} = (2n + 1)\pi/\Omega_{GR}$ when this
probability acquires its maximum value $\Omega_R^2 / \left(\Delta^2 + \Omega_R^2\right)$. For probability $p\left(E_2^{(0)}\right)$
the situation is reversed: the probability reaches its value of unity at times
$t_{f,n} = 2\pi n/\Omega_{GR}$, but its minimum value, occurring at $t_{f,n} = (2n + 1)\pi/\Omega_{GR}$, is not zero
but is equal to $\Delta^2 / \left(\Delta^2 + \Omega_R^2\right)$. Figure 10.2 depicts these oscillations of probability,
known as Rabi oscillations. The period of these oscillations as well as the maximum
and minimum values of the corresponding probabilities depend on the detuning
parameter $\Delta$ controlled by the experimentalists. For large detuning, $\Delta \gg \Omega_R$, the
frequency of oscillations is determined mostly by $\Delta$, but their swing (the difference
between the largest and smallest values) diminishes. For instance, the largest value of
$p\left(E_1^{(0)}\right)$ becomes of the order of $\Omega_R^2/\Delta^2 \ll 1$, while the smallest value of $p\left(E_2^{(0)}\right)$
is in this case very close to unity: $1 - \Omega_R^2/\Delta^2$. For both probabilities there are not
many oscillations to speak of. A more interesting situation arises in the case of
small detuning, with the special case of zero detuning being of most interest. The
frequency of Rabi oscillations in this case becomes smallest and is equal to $\Omega_R$,
which is called the Rabi frequency, and the probabilities swing between exact zero and
exact unity, becoming the most pronounced.

Fig. 10.2 Oscillations of the probability $p$ for three values of the detuning: $\Delta = 0$, $\Delta/\Omega_R = 0.5$, and $\Delta/\Omega_R = 1.5$
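
The dependence of the period and of the swing on the detuning can be tabulated directly from Eq. 10.32. The short sketch below does this for the three detuning values quoted in the caption of Fig. 10.2; measuring frequencies in units of the Rabi frequency and the function name `p_upper` are assumptions made for the illustration.

```python
import math


def p_upper(t, delta, omega_r):
    """Probability of finding the system in the upper level at switch-off time t."""
    omega_gr = math.sqrt(delta ** 2 + omega_r ** 2)
    return omega_r ** 2 / omega_gr ** 2 * math.sin(0.5 * omega_gr * t) ** 2


OMEGA_R = 1.0  # the Rabi frequency sets the unit of frequency (illustrative choice)
summary = {}
for ratio in (0.0, 0.5, 1.5):  # detuning in units of the Rabi frequency
    delta = ratio * OMEGA_R
    omega_gr = math.sqrt(delta ** 2 + OMEGA_R ** 2)
    period = 2 * math.pi / omega_gr
    peak = p_upper(0.5 * period, delta, OMEGA_R)  # sin^2 reaches 1 at half a period
    summary[ratio] = (period, peak)
    print(f"detuning/Rabi = {ratio:.1f}  period = {period:.3f}  largest probability = {peak:.3f}")
```

As the detuning grows, the oscillations become faster but shallower: the largest probability of finding the system in the upper level drops from 1 at zero detuning to $\Omega_R^2 / (\Delta^2 + \Omega_R^2)$.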

If you are interested how one can observe Rabi oscillations, here is an example 
of how it can be done. Imagine that you subject an ensemble of two-level systems 
to a strong time-periodic electric field with small detuning and turn it off at different 
times. The timing of the switching-off will determine the probability of a two-level 
system to be in the higher energy state. The fraction of the systems in the ensemble 
in this state is proportional to the corresponding probability. The systems will 
eventually undergo transition to the lower energy level and emit light. The intensity 
of the emitted light will be proportional to the number of systems in the upper state 
and will change periodically with the switching-off time. In real experiments there is 
no actual need to turn the electric field on and off all the time because spontaneous 
transitions of the system from upper to lower energy states happen even with the 
electric field on, and when this transition happens, the system is kicked off its normal 
dynamics so hard that it forgets everything about what was happening to it before
that, so that the whole process starts anew. These kicks serve effectively as switches
for the electric field. Oscillations in this case can be observed as functions of Rabi 
frequency controlled by the strength of the applied electric field. It is important that 
Rabi oscillations can be observed only if their period is shorter than the time interval 
between the “kicks.” To fulfill this condition, the applied electric field must be strong 
enough to yield oscillations with a sufficiently high frequency. 


10.3 Problems 
Problems for Sect. 10.1 

Problem 136 Find the approximate expression for energy levels of a two-level
system with a time-independent perturbation in the limit $|V_{12}| \ll |E_1 - E_2|$ for
two cases: $E_1 > E_2$ and $E_1 < E_2$.

Problem 137 Assume that the perturbation part of the Hamiltonian is given by
$V = z\mathcal{E}$, where $\mathcal{E}$ is the electric field and $z$ is the respective coordinate operator.
Assume also that the states included in the Hamiltonian are
described (in the coordinate representation) by wave functions defined on the
one-dimensional interval $-\infty < z < \infty$:

\[
\langle z | E_1 \rangle = \exp\left(-|z| / a_B\right)
\]

where $a_B$ is the Bohr radius for an electron in the hydrogen atom, and that the electric
field is given in terms of the binding energy of the hydrogen atom $W_b$ and the electron’s
charge $e$ as $\mathcal{E} = W_b / (e a_B)$. The unperturbed energy levels are given as

\[
E_1 = W_b (1 + u)
\]

\[
E_2 = W_b (1 - u),
\]

where $u$ is a dimensionless parameter that can be changed between values $-2$ and $2$.

1. Find the perturbation matrix $V_{ij}$.

2. Find the eigenvalues of the full Hamiltonian, and plot them as functions of $u$.

3. Also find the eigenvectors of the full Hamiltonian, and plot the ratio of the relative
weights of the initial vectors $|E_{1,2}\rangle$ in both eigenvectors found as functions
of $u$.

4. Also, consider the phases of the ratio of the coefficients $c_1/c_2$ for both eigenvectors,
and plot their dependence on the parameter $u$.

5. In all plots pay special attention to the region around $u = 0$, and describe how
the behavior of the eigenvalues of the perturbed Hamiltonian differs from the
corresponding behavior of the unperturbed energies.

6. Describe the behavior of the absolute values and the phases of $c_1/c_2$ in the
vicinity of $u = 0$.


Problems for Sect. 10.2 

Problem 138 Find the probability that the measurement of the energy will yield the
value $E_2^{(0)}$ directly from the coefficient $a_2$ in Eq. 10.31, and verify that the expression
for this probability given in the text (Eq. 10.33) is correct.

Problem 139 Find the time dependence of the probabilities $p\left(E_{1,2}^{(0)}\right)$ assuming that
at time $t = 0$ the system was in the state $|1\rangle$.


Chapter 11 

Non-interacting Many-Particle Systems 




11.1 Identical Particles in the Quantum World: Bosons 
and Fermions 

Quantum mechanical properties of a single particle are an important starting point 
for studying quantum mechanics, but in real experimental and practical situations, 
you will rarely deal with just a single particle. Most frequently you encounter 
systems consisting of many (from two to infinity) interacting particles. The main
difficulty in dealing with many-particle systems comes from a significantly increased
dimensionality of space, where all possible states of such systems reside. In Sect. 9.4 
you saw that the states of the system of two spins belong to a four-dimensional 
spinor space. It is not too difficult to see that the states of a system consisting of N 
spins would need a $2^N$-dimensional space to fit them all. Indeed, adding each new
spin 1/2 particle with two new spin states, you double the number of basis vectors 
in the respective tensor product, and even the system of as few as ten particles 
inhabits a space requiring 1024 basis vectors. More generally, imagine that you 
have a particle which can be in one of M mutually exclusive states, represented 
obviously by M mutually orthogonal vectors (I will call them single-particle states), 
which can be used as a basis in this single-particle M-dimensional space. You can 
generate a tensor product of single-particle spaces by stacking together M basis 
vectors from each single-particle space. Naively you might think that the dimension 
of the resulting space will be $M^N$, but it is not always so. The reality is more
interesting, and to get the dimensionality of many-particle states correctly, you need 
to dig deeper into the concept of identity of quantum particles. 
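
The naive counting of product basis vectors can be made concrete by listing their labels. The sketch below uses `itertools.product` from the Python standard library and treats the particles as distinguishable, which is precisely the assumption questioned in the rest of this section; the state labels "up" and "down" and the function name `product_basis` are illustrative choices.

```python
from itertools import product


def product_basis(single_particle_states, n_particles):
    """All labels of tensor-product basis vectors for distinguishable particles."""
    return list(product(single_particle_states, repeat=n_particles))


two_spins = product_basis(("up", "down"), 2)
ten_spins = product_basis(("up", "down"), 10)
print(two_spins)
print(len(two_spins), len(ten_spins))
```

Two spins give the four basis vectors of Sect. 9.4, and ten spins already give 1024 of them.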

In the classical world, we know that all electrons are the same—the same charge 
and the same mass—but if necessary we can still distinguish between them saying 
that this is an electron with such and such initial coordinates and initial velocity, and 
therefore, it follows this particular trajectory. A second electron, which is exactly the 
same as the first one, but starting out with different initial conditions, follows its own 
trajectory. And even if these two electrons interact, scatter off each other, we can still
say which electron is which by following their trajectories (see Fig. 11.1). Thus, we
say that classical electrons, even though they are identical, are still distinguishable.

Fig. 11.1 Two distinguishable classical electrons interact with each other and follow their own distinguishable trajectory. We can easily say which electron follows which trajectory

Fig. 11.2 Propagating clouds of probabilities representing the particles. In the interaction region, the clouds overlap, and the individuality of the particles is lost (Warning: it is dangerous to take this cartoon too seriously!)

The situation changes when you are talking about quantum particles. In essentially
the same setup (two particles approach each other from opposite directions,
interact, and move each in its new direction), the situation becomes completely
different. Now instead of two well-localized particles with perfectly defined
trajectories, you are dealing with moving amorphous clouds of probabilities, and when
they approach each other and overlap, all you can measure is the probability to find
one or two particles within a certain region of space, but you have no means to
tell which of the observed particles is which (Fig. 11.2). In quantum mechanics the
individuality of particles is completely lost: they are not just identical, but they are
indistinguishable.

Now the questions arise: how to formally describe this indistinguishability, and 
what are the observable consequences of this property? To begin, let me formally 
assign numbers 1 and 2 to the two particles and assume that particle 1 is in the 
state described by vector $|\alpha^{(1)}\rangle$, where $\alpha$ indicates a particular quantum state and 1
assigns this state to the first particle, and the second particle is in the state $|\beta^{(2)}\rangle$.
The space of the two-particle states can be generated by the tensor product of the
single-particle states with a two-vector basis:

\[
|\psi_1^{(\alpha,\beta)}\rangle = |\alpha^{(1)}\rangle |\beta^{(2)}\rangle \tag{11.1}
\]

\[
|\psi_2^{(\alpha,\beta)}\rangle = |\alpha^{(2)}\rangle |\beta^{(1)}\rangle, \tag{11.2}
\]

where the second vector is obtained from the first by replacing particle 1 with
particle 2. This operation can be formally described by a special “exchange”
operator $\mathcal{P}(1,2)$ whose job is to interchange the indexes of the particles assigned to
each state:

\[
|\alpha^{(2)}\rangle |\beta^{(1)}\rangle = \mathcal{P}(1,2)\, |\alpha^{(1)}\rangle |\beta^{(2)}\rangle.
\]

When applied twice, this operator obviously leaves the initial vector intact (two
exchanges $1 \to 2$, $2 \to 1$ are equivalent to no exchange at all), meaning that
$\mathcal{P}^2(1,2) = I$. An immediate consequence of this identity is that the eigenvalues of
this operator are equal to either 1 or −1.
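
For two particles with two single-particle states each, the exchange operator is just a 4×4 permutation matrix, and its properties can be verified by hand or by a few lines of code. The sketch below is an illustration under explicit assumptions: the single-particle states are labeled 0 and 1, the product basis is ordered as (0,0), (0,1), (1,0), (1,1), and the helper names are made up for the example.

```python
from itertools import product

# Basis labels (state of particle 1, state of particle 2) for two two-state particles.
BASIS = list(product((0, 1), repeat=2))


def exchange_matrix():
    """Matrix of the exchange operator: it maps the label (s1, s2) to (s2, s1)."""
    size = len(BASIS)
    matrix = [[0] * size for _ in range(size)]
    for col, (s1, s2) in enumerate(BASIS):
        matrix[BASIS.index((s2, s1))][col] = 1
    return matrix


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def apply(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


P = exchange_matrix()
identity = [[int(i == j) for j in range(4)] for i in range(4)]

symmetric = [0, 1, 1, 0]       # |0>|1> + |1>|0>, normalization omitted
antisymmetric = [0, 1, -1, 0]  # |0>|1> - |1>|0>, normalization omitted
product_state = [0, 1, 0, 0]   # |0>|1> alone is not an eigenvector

print(matmul(P, P) == identity)
print(apply(P, symmetric), apply(P, antisymmetric), apply(P, product_state))
```

The symmetric combination is returned unchanged, the antisymmetric one changes sign, and a simple product state with two different single-particle states is not an eigenvector of the exchange operator at all.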

Using the exchange operator, the concept of indistinguishability can be
formulated in a precise and formal way. Consider an arbitrary state of two particles
represented by vector $|\psi(1,2)\rangle$. If particles 1 and 2 are truly indistinguishable, then
vector $|\psi(2,1)\rangle = \mathcal{P}(1,2)\,|\psi(1,2)\rangle$ and initial vector $|\psi(1,2)\rangle$ must represent the
same state, which means that they can differ from each other only by a constant
factor. The formal representation of the last statement looks like this:

\[
\mathcal{P}(1,2)\,|\psi(1,2)\rangle = |\psi(2,1)\rangle = \lambda\, |\psi(1,2)\rangle, \tag{11.3}
\]

which makes it clear that if $|\psi(1,2)\rangle$ represents a state of indistinguishable particles,
it must be an eigenvector of $\mathcal{P}(1,2)$. The remarkable thing about this conclusion
is that there are only two types of such eigenvectors: those corresponding to
eigenvalue 1 and those that belong to eigenvalue −1. In other words, any vector describing a
state of indistinguishable particles must belong to one of two classes: symmetric
(even) with respect to the exchange of the particles when

\[
\mathcal{P}(1,2)\,|\psi(1,2)\rangle = |\psi(1,2)\rangle \tag{11.4}
\]

or antisymmetric (odd) if

\[
\mathcal{P}(1,2)\,|\psi(1,2)\rangle = -|\psi(1,2)\rangle. \tag{11.5}
\]

Moreover, Hamiltonians of indistinguishable particles obviously do not change
when the particles are exchanged (otherwise they wouldn’t be indistinguishable),
which means that the exchange operator and the Hamiltonian commute:

\[
\left[\mathcal{P}(1,2), H\right] = 0 \tag{11.6}
\]

(if it is not clear where it comes from, check the discussion around the parity
operator in Sect. 5.1, where similar issues were raised). In the context of the 
exchange operator, Eq. 11.6 signifies two things. First is that the Hamiltonian and the 
exchange operator are compatible and share a common system of eigenvectors. In 
other words, eigenvectors of a Hamiltonian of two indistinguishable particles must 
be either symmetric or antisymmetric. Second, if you treat the exchange operator 
as a representative of a specific observable that takes only two values depending 
on the symmetry of the state, Eq. 11.6 indicates that the expectation value of this 
observable does not change with time (see Sect. 4.1.3 and the discussion around 
Eq. 4.17 there). Accordingly, Eq. 11.6 ensures that if a two-particle system starts out 
in a symmetric or antisymmetric state, it will remain in this state forever. 

While it is useful to know that all states of indistinguishable particles must belong 
to one of the two symmetry classes and that a system put in a symmetry class at some 
instant of time will stay in this class forever, we still do not know how to relate the 
symmetry of the states of a particular system of particles to their other properties:
does it depend on the particles’ charges, masses, and the potential they are moving in,
or can it be somehow created deliberately through a clever measurement process?
I personally find the answer to all these questions, which I am about to reveal to
you, quite amazing: the symmetry of any state of indistinguishable particles cannot
be “chosen” or changed; the particles are born with a predestined fate to be only
in states with one or another symmetry predetermined by their spin. It turns out
that particles with half-integer spin can exist only in antisymmetric states, while
particles with integer spins can be only in symmetric states. This statement is called
the spin-statistics theorem and is, in my view, one of the most amazing fundamental
results, which follows purely mathematically from the requirement that quantum 
mechanics agrees with the relativity theory. Just stop to think about it: quantum 
mechanics deals with phenomena occurring at very small spatial, temporal, mass, 
and energy scales, while relativity theory explains the behavior of nature at very 
large velocities. Apparently, quantum mechanics and relativity overlap when the 
interaction between light and matter is involved and, at high energies, when particle- 
antiparticle phenomena become important. However, the theoretical requirements of 
the self-consistency of the theory, one of which is the spin-statistics theorem, are felt 
well outside of these overlap areas and penetrate all of quantum mechanics from 
atomic energy structure to electric, magnetic, and optical properties of solids and 
fluids. Two better known phenomena made possible by the spin-statistics connection 
are superfluidity and superconductivity. The proof of this theorem relies heavily on 
quantum field theory, and you will sleep better at night by just accepting it as one of 
the axioms of quantum mechanics. 

The spin-statistics theorem was proved by Wolfgang Pauli in 1939 but published 
only in 1940. The footnote to Pauli’s paper in Physical Review states that the paper 
is a part of the report prepared for the Solvay Congress 1939, which did not take 
place because of the war in Europe. By the time of that publication, Pauli had 
moved from Zurich to Princeton, because Switzerland had rejected his request for Swiss
citizenship on the grounds that he had become a German citizen after Hitler annexed
his native Austria. Anyway, to finish with this theorem, I only need to mention
that the particles with half-integer spins are called fermions (in honor of Enrico
Fermi, an Italian physicist, who had to leave Italy in 1938 under Mussolini’s regime;
he moved to the USA, where he created the world’s first nuclear reactor and played
a crucial role in the Manhattan Project), while particles with integer spins are called
bosons after the Indian physicist Satyendra Nath Bose, who worked on systems of
indistinguishable photons as early as 1924. It is interesting that after an initial
attempt to publish his paper on this topic failed, Bose sent it to Einstein, asking for
Einstein’s opinion and assistance with publication. Einstein translated the paper into
German and had it published in the leading German physics journal of the time, Zeitschrift
für Physik (under Bose’s name, of course).

Before getting back to the business of doing practical quantum mechanics with 
many-particle systems, a few additional words about fermions and bosons might be 
useful. Among elementary particles constituting regular matter, fermions are most 
abundant: electrons, protons, and neutrons—the main building blocks of atoms, 
molecules, and the rest of the material world are all spin-1/2 fermions. The only
elementary boson you would encounter in a regular setting is the photon,
a quantum of the electromagnetic field and not a regular material particle. This
can be taken as a general rule: as long as we are talking about elementary
particles, matter is represented by fermions, while the interaction fields and
other objects that would classically be described as waves become bosons in quantum
mechanics. Other examples of such bosons are quantized elastic waves
(phonons) and quantized magnetic waves (magnons).

However, the concept of fermions and bosons can be extended to composite
particles as long as the processes they are taking part in do not change their
internal structure. The most famous examples of such composite particles are
electron Cooper pairs (Cooperons, named after the American physicist Leon Cooper,
who discovered them in 1956), responsible for the phenomenon of superconductivity, and
$^4$He nuclei, which in addition to the two mandatory protons contain two neutrons,
making the total number of particles in the nucleus equal to 4. In both these
examples, we are dealing with composite bosons. Indeed, a pair of electrons, as
you already know, can be in a state with total spin either 0 or 1, i.e., the spin
of the pair is in either case an integer. In the case of the $^4$He nucleus, there are four
spin-1/2 particles, and by dividing them into two pairs, you can see that the total
spin of this system can again only be an integer. Another interesting example of
a composite boson is an exciton in a semiconductor, which consists of an electron
and a hole,¹ both with spin 1/2. The extent to which the inner structure in all
these examples can be neglected and the particles can be treated as bosons depends
on the amount of energy required to disintegrate them into their constituent parts.
For Cooperons this energy is quite small, of the order of $10^{-3}$ eV, which explains
why they can only survive at very low (below 10 K) temperatures; exciton binding
energies vary over a rather large range, between several millielectronvolts and several
hundred millielectronvolts depending upon the material, and excitons therefore survive
at temperatures between 10 K and room temperature (300 K). The $^4$He nucleus, of
course, is the most stable of all the composite particles discussed: it takes a whole
28.3 megaelectronvolts to take it apart.

¹ Energy levels in a semiconductor are organized in bands separated by large gaps. The band
in which all energy levels are filled with electrons is called the valence band, and the closest
empty band above it is the conduction band. When an electron gets excited from the valence band, the
conduction band acquires an electron, and the valence band loses an electron, which leaves in its
stead a positively charged hole. Here you have an electron-hole pair behaving as real negatively and
positively charged spin-1/2 particles.

After this short and hopefully entertaining detour, it is time to get back to business 
of figuring out how to implement the requirements of the spin-statistics theorem 
in practical calculations. A generic vector representing a two-particle state and
expressed as a linear combination of the basis vectors of Eqs. 11.1 and 11.2,

$$|\psi(1,2)\rangle = a_1|\alpha^{(1)}\rangle|\beta^{(2)}\rangle + a_2|\alpha^{(2)}\rangle|\beta^{(1)}\rangle,$$

with arbitrary coefficients $a_{1,2}$ does not obey the required symmetry condition.
However, after a few minutes of contemplation and silent staring at this expression,
you will probably see that you can satisfy the symmetry requirements of Eq. 11.4
by choosing $a_1 = a_2$, while Eq. 11.5 can be made happy with the choice $a_1 = -a_2$.
(If you are not that big on contemplation, just switch the particles in the expression
for $|\psi(1,2)\rangle$, and write down the conditions of Eq. 11.4 or 11.5 explicitly.) If in
addition to symmetry you want your two-particle states to be also normalized, you
can choose for fermions

$$|\psi_f(1,2)\rangle = \frac{1}{\sqrt{2}}\left[|\alpha^{(1)}\rangle|\beta^{(2)}\rangle - |\alpha^{(2)}\rangle|\beta^{(1)}\rangle\right] \tag{11.7}$$

and for bosons

$$|\psi_b^{(1)}(1,2)\rangle = \frac{1}{\sqrt{2}}\left[|\alpha^{(1)}\rangle|\beta^{(2)}\rangle + |\alpha^{(2)}\rangle|\beta^{(1)}\rangle\right]. \tag{11.8}$$

While Eq. 11.7 exhausts all possible two-particle states for fermions, in the case of
bosons, two more states, in which different particles occupy the same single-particle
state, can be constructed:

$$|\psi_b^{(2)}(1,2)\rangle = |\alpha^{(1)}\rangle|\alpha^{(2)}\rangle \tag{11.9}$$

$$|\psi_b^{(3)}(1,2)\rangle = |\beta^{(1)}\rangle|\beta^{(2)}\rangle. \tag{11.10}$$

An attempt to arrange a similar state for fermions fails because you cannot have 
an antisymmetric expression with two identical states—they simply cancel each 
other giving you a zero. In other words, it is impossible to have a two-particle 
state of fermions, in which each fermion is in the same single-particle state. This 
is essentially an expression of the famous Pauli exclusion principle, which Pauli
formulated in 1925 trying to explain why atoms with an even number of electrons are
more chemically stable than atoms with an odd number of electrons. He realized that this
can be explained by requiring that there can only be one electron per single-electron
state. If one takes into account only orbital quantum numbers such as the principal
number $n$, orbital number $l < n$, and magnetic number $|m| \le l$ (see Chap. 8), the
total number of available states is equal to $n^2$, which does not have to be even. So,
Pauli postulated the existence of yet another quantum quantity, which can only take
two different values, making the total number of quantum numbers characterizing a
state of an electron in an atom equal to 4 and the total number of available states $2n^2$.
The initial formulation of this principle was concerned only with electrons and was
stated approximately like this: no two electrons in a many-electron atom can have
the same values of all four quantum numbers. Despite the success of this principle in
explaining the periodic table, Pauli remained unsatisfied for two principal reasons:
(a) he had no idea which physical quantity the fourth quantum number represents,
and (b) he was not able to derive his principle from more fundamental postulates
of quantum mechanics. The first of his concerns was resolved with the emergence
of the idea of spin (see Sect. 9.1), but it took him 14 long years to finally prove the
spin-statistics theorem, of which his exclusion principle is a simple corollary.
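The cancellation behind the exclusion principle is easy to see explicitly. The following sketch is an illustration only: it builds the unnormalized combinations of Eqs. 11.7 and 11.8 from two orbital labels, and the function name `two_particle_state` and the labels `"alpha"` and `"beta"` are hypothetical choices made for this example.

```python
def two_particle_state(orbital_a, orbital_b, sign):
    """Return |a>_1 |b>_2 + sign * |b>_1 |a>_2 (unnormalized).

    sign = -1 gives the fermionic combination of Eq. 11.7,
    sign = +1 gives the bosonic combination of Eq. 11.8.
    The state is stored as {(orbital_of_particle_1, orbital_of_particle_2): amplitude};
    terms that cancel are dropped, so an empty dictionary stands for the zero vector.
    """
    state = {}
    for key, amp in (((orbital_a, orbital_b), 1), ((orbital_b, orbital_a), sign)):
        state[key] = state.get(key, 0) + amp
    return {key: amp for key, amp in state.items() if amp != 0}


fermions_different = two_particle_state("alpha", "beta", -1)
fermions_same = two_particle_state("alpha", "alpha", -1)  # attempt at double occupancy
bosons_same = two_particle_state("alpha", "alpha", +1)

print(fermions_different)
print(fermions_same)   # empty: the two identical terms cancel
print(bosons_same)
```

The empty dictionary returned for two fermions in the same orbital is the zero vector, which is not a state at all, while the same construction for bosons survives (up to normalization) as Eq. 11.9.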

Before continuing I would like to clear up one terminological problem. When
dealing with many-particle systems, the word “state” might have different meanings
when used in different contexts. On the one hand, I will talk about states characterizing
the actual many-particle system; Eqs. 11.7 through 11.10 give examples of such
states for the two-particle system. On the other hand, I use single-particle states,
such as $|\alpha^{(1)}\rangle$ or $|\beta^{(2)}\rangle$, to construct the many-particle states $|\psi_f(1,2)\rangle$ or $|\psi_b^{(1)}(1,2)\rangle$.
So, in order to avoid misunderstandings and misconceptions, let’s agree that the term
“state” from now on will always refer to an actual state of a many-particle system,
while single-particle states from this point forward will be called single-particle
orbitals. Understood literally, orbitals are usually used to describe single-electron
states of atomic electrons, but I will take the liberty to expand this term to any
single-particle state. Getting this out of the way, I now want to direct your attention
to the following fact. In the system of two fermions with only two available orbitals,
we ended up with just a single two-particle state. At the same time, in the case of
the same number of bosons and the same number of orbitals, there are three linearly
independent orthogonal two-particle states, and if we were to forget about symmetry
requirements (as we would if dealing with distinguishable particles), we would have
ended up with a four-dimensional space of two-particle states just like in the two-spin
problem from Sect. 9.4. You can see now that the dimensionality of the space
containing many-particle states depends strongly on the symmetry requirements,
and the naive prediction for this dimension, $M^N$ for N particles and M orbitals,
turns out to be correct only for distinguishable particles.

11.2 Constructing a Basis in a Many-Fermion Space


While identical bosons are responsible for some fascinating phenomena such as 
superfluidity and superconductivity, the systems of many fermions are much more 
ubiquitous in the practical applications of quantum theory, and, therefore, I will 
mostly focus on them from now on. As always, the first thing to understand is the 
structure of the space in which vectors representing the states of interest live. This 
includes finding its dimension and constructing a basis. The problem of finding 
the dimension of a many-particle space is an exercise in combinatorics—the science 
of counting the number of different combinations of various objects. In the case 
of fermions, the problem is formulated quite simply: given N objects (particles) 
and M boxes (orbitals), you need to compute in how many different ways you can 
fill the boxes assuming that each box can hold only one particle and that the order in
which the particles are distributed among the boxes is not important. Once you
find one distribution of the particles among the boxes, it becomes a seed for one
many-particle state. The state itself is found by permuting the particles among the
boxes, adding a negative sign for each pairwise exchange and summing up the results.
To understand the situation better, begin with the simplest case: M = N. When the
number of particles is equal to the number of orbitals, you do not have much of a
choice: you just have to put one particle in each box and then do the permutations;
you end up with a single antisymmetric state. As an example, consider three
particles that can be in one of three available orbitals $|\alpha_i^{(j)}\rangle$, where the lower index
enumerates the orbitals and the upper index refers to the particles. Assume that
you put the first particle in the first box, the second particle in the second, and
the third one in the third, generating the following combination of the orbitals:
$|\alpha_1^{(1)}\rangle|\alpha_2^{(2)}\rangle|\alpha_3^{(3)}\rangle$. Now, let me switch particles 1 and 2, generating the combination
$-|\alpha_1^{(2)}\rangle|\alpha_2^{(1)}\rangle|\alpha_3^{(3)}\rangle$. If I switch the particles again, say, particles 1 and 3, I will get
the new combination $|\alpha_1^{(2)}\rangle|\alpha_2^{(3)}\rangle|\alpha_3^{(1)}\rangle$. Note that the negative sign has disappeared
because each new exchange brings about a change of sign. Making all 6 (3!)
permutations, you will end up with a single three-particle state:

$$\begin{aligned}|\alpha_1,\alpha_2,\alpha_3\rangle = \frac{1}{\sqrt{6}}\Big[&|\alpha_1^{(1)}\rangle|\alpha_2^{(2)}\rangle|\alpha_3^{(3)}\rangle - |\alpha_1^{(2)}\rangle|\alpha_2^{(1)}\rangle|\alpha_3^{(3)}\rangle + |\alpha_1^{(2)}\rangle|\alpha_2^{(3)}\rangle|\alpha_3^{(1)}\rangle \\ -\,&|\alpha_1^{(3)}\rangle|\alpha_2^{(2)}\rangle|\alpha_3^{(1)}\rangle + |\alpha_1^{(3)}\rangle|\alpha_2^{(1)}\rangle|\alpha_3^{(2)}\rangle - |\alpha_1^{(1)}\rangle|\alpha_2^{(3)}\rangle|\alpha_3^{(2)}\rangle\Big].\end{aligned} \tag{11.11}$$


In agreement with the permutation rules described above, all terms in Eq. 11.11 with
negative signs in front of them can be obtained from the first term by exchanging
just one pair of particles, while the terms with the positive sign are obtained by
two successive pair exchanges. It makes sense, of course, because, as I said before,
an exchange of any two fermions is accompanied by a change of sign, in which
case two successive exchanges are equivalent to changing the sign twice:
$+ \to - \to +$, which is, of course, the same argument as I made when deriving this
expression. Finally, if you are wondering how to choose the first, seeding, term, the
answer is simple: it does not matter and you can start with any of them. The only
difference, which you might notice, is that all negative terms could become positive
and vice versa, which amounts to a simple overall negative sign in front of the whole
expression, and this makes no physical difference whatsoever.

Now, since every selection of N boxes (as many as there are particles) out of the
available M yields just a single many-particle state, the total number of states
is simply equal to the number of ways you can select N boxes out of M. This is a
classical combinatorial problem with a well-known solution given by the number of
combinations of N objects chosen out of M. Thus, the number of distinct linearly
independent and orthogonal N-fermion states based on M available single-fermion
orbitals (the dimensionality D(N, M) of the corresponding space) is

$$D(N,M) = \binom{M}{N} = \frac{M!}{N!\,(M-N)!}. \tag{11.12}$$

You can verify this general result with a few simple examples. Let’s say that now
you want to build a space of three-fermion states using five available single-fermion
orbitals. According to Eq. 11.12 this space possesses 5!/(3!2!) = 10 basis vectors.
Using the same notation as before, but now allowing index i in $|\alpha_i^{(j)}\rangle$ to run from 1 to 5,
you can generate the following ten seed vectors, in which each particle is assigned
to a different orbital:

$$|\alpha_i^{(1)}\rangle|\alpha_j^{(2)}\rangle|\alpha_k^{(3)}\rangle, \quad (i,j,k) = (1,2,3),\ (1,2,4),\ (1,2,5),\ (1,3,4),\ (1,3,5),\ (1,4,5),\ (2,3,4),\ (2,3,5),\ (2,4,5),\ (3,4,5).$$

Each of these seeds yields a single antisymmetric state in exactly the same way
as in the previous example.
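The counting behind Eq. 11.12 can be checked by brute force. The sketch below is an illustration, not part of the derivation: it lists the seeds as sets of orbital indices using Python's standard `itertools`, and the function names are hypothetical. For comparison it also counts the cases of bosons (an orbital may be used more than once) and of distinguishable particles (every assignment counts separately) for two particles and two orbitals.

```python
from itertools import combinations, combinations_with_replacement, product
from math import comb


def fermion_seeds(n_particles, n_orbitals):
    """Every way to pick n_particles distinct orbitals out of n_orbitals (order ignored)."""
    return list(combinations(range(1, n_orbitals + 1), n_particles))


def boson_seeds(n_particles, n_orbitals):
    """Same as fermion_seeds, but an orbital may be chosen more than once."""
    return list(combinations_with_replacement(range(1, n_orbitals + 1), n_particles))


def distinguishable_assignments(n_particles, n_orbitals):
    """Every assignment of an orbital to each labeled particle."""
    return list(product(range(1, n_orbitals + 1), repeat=n_particles))


# Three fermions in five orbitals: the seeds agree with Eq. 11.12.
seeds_3_of_5 = fermion_seeds(3, 5)
print(len(seeds_3_of_5), comb(5, 3))
print(seeds_3_of_5)

# Two particles in two orbitals: fermions, bosons, distinguishable particles.
print(len(fermion_seeds(2, 2)),
      len(boson_seeds(2, 2)),
      len(distinguishable_assignments(2, 2)))
```

Each tuple returned by `fermion_seeds` plays the role of one seed; antisymmetrizing it produces one basis vector of the many-fermion space.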

A bit of gazing at Eq. 11.11 might reveal to you an ultimate truth about the structure
of this expression: it is a sum of products of distinct combinations of nine
elements $|\alpha_i^{(j)}\rangle$, where each index takes three different values, grouped in threes with
alternating positive and negative signs. Some digging in your associative memory
will bring to the surface that this is nothing but the determinant of a matrix whose rows
correspond to the three participating orbitals and whose columns correspond to the three particles:

$$|\alpha_1,\alpha_2,\alpha_3\rangle = \frac{1}{\sqrt{6}}\begin{vmatrix} |\alpha_1^{(1)}\rangle & |\alpha_1^{(2)}\rangle & |\alpha_1^{(3)}\rangle \\ |\alpha_2^{(1)}\rangle & |\alpha_2^{(2)}\rangle & |\alpha_2^{(3)}\rangle \\ |\alpha_3^{(1)}\rangle & |\alpha_3^{(2)}\rangle & |\alpha_3^{(3)}\rangle \end{vmatrix}, \tag{11.13}$$

where on the left of this equation I introduced a notation $|\alpha_1,\alpha_2,\alpha_3\rangle$ which
contains all the information you need to know about the state presented on the
right, namely, that this three-fermion state is formed by distributing three particles
among orbitals $|\alpha_1\rangle$, $|\alpha_2\rangle$, and $|\alpha_3\rangle$. The right-hand side of this expression gives
you a good mnemonic rule about how to combine these three orbitals into an
antisymmetric three-fermion state. Arranging the orbitals into determinants makes
the antisymmetry of the corresponding state obvious: the exchange of particles
becomes mathematically equivalent to the interchange of the columns of the
determinant, and this operation is well known to reverse its sign.
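The determinant rule can also be spelled out as a sum over permutations with alternating signs. The following is a minimal sketch under stated assumptions: orbitals are represented by plain labels, a many-particle vector is a dictionary whose keys list the orbital occupied by particle 1, 2, 3, ..., the normalization factor $1/\sqrt{N!}$ is omitted, and all function names are hypothetical.

```python
from itertools import permutations


def permutation_sign(perm):
    """Sign of a permutation of 0..n-1, found by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def slater_state(orbitals):
    """Unnormalized antisymmetric combination of the given orbital labels.

    Entry k of each key is the orbital occupied by particle k + 1; the value is
    the sign of the term, exactly as in the expansion of a determinant.
    """
    state = {}
    for perm in permutations(range(len(orbitals))):
        key = tuple(orbitals[p] for p in perm)
        state[key] = state.get(key, 0) + permutation_sign(perm)
    return {key: amp for key, amp in state.items() if amp != 0}


def exchange(state, a, b):
    """Swap particles a and b (0-based positions) in every term."""
    swapped = {}
    for key, amp in state.items():
        labels = list(key)
        labels[a], labels[b] = labels[b], labels[a]
        swapped[tuple(labels)] = amp
    return swapped


state = slater_state(("a1", "a2", "a3"))
negated = {key: -amp for key, amp in state.items()}

print(len(state))                          # number of terms, 3! = 6
print(exchange(state, 0, 1) == negated)    # exchanging particles 1 and 2 flips the sign
print(slater_state(("a1", "a1", "a3")))    # a repeated orbital gives the zero vector
```

The last line shows the exclusion principle at work for three particles: with a repeated orbital every term finds a partner of opposite sign, just as a determinant with two equal rows vanishes.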

The idea to arrange orbitals into determinants in order to construct automatically 
antisymmetric many-fermion states was first used independently by Heisenberg and 
Dirac in their 1926 papers and expressed in a more formal way by John C. Slater, an 
American physicist, in 1929, and for this reason these determinants bear his name. 
That was a time when American physicists had to travel for postdoctoral positions 
to Europe, and not the other way around, so after getting his Ph.D. from Harvard, 
Slater moved to Cambridge and then to Copenhagen before coming back to the USA 
and joining the Physics Department at Harvard as a faculty member. 

A word of caution: the fermion states in the form of the Slater determinant are
not necessarily the eigenvectors of a many-particle Hamiltonian, which, in general,
can be presented in the form

$$\hat{H} = \sum_{i=1}^{N}\hat{H}_i + \frac{1}{2}\sum_{i=1}^{N}\sum_{j\neq i}\hat{V}_{ij}, \tag{11.14}$$

where the first term is the sum of the single-particle Hamiltonians for each particle,
which includes operators of the particle’s kinetic energy and might include a term
describing the interaction of each particle with some external object, e.g., electric
field, while the second term describes the interaction between the particles, most
frequently the Coulomb repulsion between negatively charged electrons. The factor
1/2 in front of the second term takes into account that the double summation over i
and j counts the interaction between each pair of particles twice: once as $V_{ij}$ and the
second time as $V_{ji}$. The principal difference between these two terms is that while
each $\hat{H}_i$ acts only on the orbitals of “its own” particle, the interaction term acts on
the orbitals of two particles. As a result, any simple tensor product of single-particle
orbitals (chosen as eigenvectors of the single-particle Hamiltonians) is an eigenvector
of the first term of the many-particle Hamiltonian, but not
of the entire Hamiltonian. Consider, for instance, the three-particle state from the
previous example. Picking up just one term from Eq. 11.11, I can write

$$\begin{aligned}\left(\hat{H}_1 + \hat{H}_2 + \hat{H}_3\right)|\alpha_1^{(1)}\rangle|\alpha_2^{(2)}\rangle|\alpha_3^{(3)}\rangle &= \hat{H}_1|\alpha_1^{(1)}\rangle|\alpha_2^{(2)}\rangle|\alpha_3^{(3)}\rangle + |\alpha_1^{(1)}\rangle\hat{H}_2|\alpha_2^{(2)}\rangle|\alpha_3^{(3)}\rangle + |\alpha_1^{(1)}\rangle|\alpha_2^{(2)}\rangle\hat{H}_3|\alpha_3^{(3)}\rangle \\ &= \left(E_1 + E_2 + E_3\right)|\alpha_1^{(1)}\rangle|\alpha_2^{(2)}\rangle|\alpha_3^{(3)}\rangle,\end{aligned} \tag{11.15}$$

where $E_i$ is the single-particle energy of orbital $|\alpha_i\rangle$.
Since all other terms in Eq. 11.11 feature the same three orbitals, it is obvious
that all of them are eigenvectors of this Hamiltonian with the same eigenvalue, so
that the entire antisymmetric three-particle state given by the Slater determinant,
Eq. 11.13, is also its eigenvector. It is also clear that for any Slater determinant state,
the eigenvalue of the non-interacting Hamiltonian is always a sum of the single-particle
energies of the orbitals used to construct the determinant. If, however, one
adds the interaction term to the picture, the situation changes as none of the single-particle
orbitals can be eigenvectors of $\hat{V}_{ij}$, which acts on states of two particles, so
that the Slater determinants are no longer stationary states of the many-fermion system.
This does not mean, of course, that they are useless—they form a convenient basis 
in the space of many-particle states, which ensures that all states represented in this 
basis are antisymmetric. This brings me back to Eq. 11.12, defining the dimension 
of this space and highlighting the main difficulty of dealing with interacting many- 
particle systems—the space containing the corresponding states is just too large. 

Consider, for instance, an atom of carbon, with its six electrons. You can start
building the basis for the six-electron space starting with the lowest energy orbitals
and continuing until you have enough basis vectors. The two lowest energy orbitals
correspond to principal quantum number n = 1, orbital and magnetic numbers
equal to zero, and two spin numbers ±1/2: $|1,0,0,1/2\rangle$ and $|1,0,0,-1/2\rangle$. This is
definitely not enough for six electrons, so you need to go to orbitals with n = 2, of
which there are 8: $|2,0,0,1/2\rangle$, $|2,0,0,-1/2\rangle$, $|2,1,-1,1/2\rangle$, $|2,1,-1,-1/2\rangle$,
$|2,1,0,1/2\rangle$, $|2,1,0,-1/2\rangle$, $|2,1,1,1/2\rangle$, and $|2,1,1,-1/2\rangle$, where the notation
follows the regular scheme $|n,l,m,m_s\rangle$ (I combined the spin number $m_s$ with orbital
quantum numbers for the sake of simplifying the notation). If I limit the space to
just these ten orbitals (and it is by no means certain that orbitals with n = 3 can be
left out), the total number of basis vectors in this space will be 10!/(6!4!) = 210.
It means that using the Slater determinants as a basis in this space, I will end up
with the Hamiltonian of the system represented by a 210 × 210 matrix. Allowing
the electrons to occupy the additional n = 3 orbitals, all 18 of them, will bring the
dimensionality of the six-electron space to 376,740. I hope these examples give you
a clear picture of how difficult problems with many interacting particles can be and
explain why people were busy inventing a great variety of different approximate
ways of dealing with them. Very often, the idea behind these methods is to replace
the Hamiltonian in Eq. 11.14 by an effective Hamiltonian without an interaction
term. The effects of the interaction in such approaches are always hidden in “new”
single-particle Hamiltonians retaining some information about the interaction with
other particles. A more detailed exposition of this issue is way beyond the scope of
this book and can be found in many texts on atomic physics and quantum chemistry.
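The numbers quoted for carbon follow directly from Eq. 11.12 and are easy to reproduce. The short sketch below is an illustration with hypothetical function names; it assumes hydrogen-like orbitals $|n,l,m,m_s\rangle$, of which there are $2n^2$ in the shell with principal number n.

```python
from math import comb


def orbitals_up_to_shell(n_max):
    """Number of orbitals |n, l, m, m_s> with n <= n_max (2 n^2 per shell)."""
    return sum(2 * n * n for n in range(1, n_max + 1))


def fermion_space_dimension(n_particles, n_orbitals):
    """Eq. 11.12: number of ways to choose n_particles orbitals out of n_orbitals."""
    return comb(n_orbitals, n_particles)


# Six electrons of carbon distributed over all orbitals up to a given shell.
for n_max in (2, 3, 4):
    n_orbitals = orbitals_up_to_shell(n_max)
    print(n_max, n_orbitals, fermion_space_dimension(6, n_orbitals))
```

The loop makes the growth of the basis explicit: each additional shell of orbitals increases the size of the Hamiltonian matrix by orders of magnitude.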

Before continuing to the next section, let me consider a few examples involving 
non-interacting indistinguishable particles so that you could get a better feel for the 
quantum mechanical indistinguishability. 

Example 29 (Non-interacting Particles in a Potential Well.) Consider a system
of three non-interacting particles in an infinite one-dimensional potential well.
Assuming that the particles are (a) distinguishable spinless atoms of equal mass
$m_a$, (b) electrons, and (c) indistinguishable spinless bosons, find the three lowest energy
eigenvalues of this system, and write down the corresponding wave functions
(spinors when necessary).

Solution 

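(a) In the case of three distinguishable atoms, no symmetry requirements can
be imposed on the three-particle wave function, so the ground state energy
corresponds to a state in which all three atoms are in the same single-particle
ground state orbital:

$$\psi^{(3)}_{1,1,1}(z_1,z_2,z_3) = \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{\pi z_1}{L}\sin\frac{\pi z_2}{L}\sin\frac{\pi z_3}{L} \tag{11.16}$$

with corresponding energy

$$E_{1,1,1} = 3E_1 = \frac{3\pi^2\hbar^2}{2m_aL^2}. \tag{11.17}$$

The second energy level would correspond to moving one of the atoms to the
second single-particle orbital, so that I have for the three degenerate
three-particle states

$$\begin{aligned}\psi^{(3)}_{2,1,1}(z_1,z_2,z_3) &= \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{2\pi z_1}{L}\sin\frac{\pi z_2}{L}\sin\frac{\pi z_3}{L},\\ \psi^{(3)}_{1,2,1}(z_1,z_2,z_3) &= \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{\pi z_1}{L}\sin\frac{2\pi z_2}{L}\sin\frac{\pi z_3}{L},\\ \psi^{(3)}_{1,1,2}(z_1,z_2,z_3) &= \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{\pi z_1}{L}\sin\frac{\pi z_2}{L}\sin\frac{2\pi z_3}{L}\end{aligned} \tag{11.18}$$

with the corresponding energy

$$E_{2,1,1} = 2E_1 + E_2 = \frac{3\pi^2\hbar^2}{m_aL^2}. \tag{11.19}$$

Finally, the next lowest energy will correspond to two particles moved to
the second single-particle level with the wave functions and triple-degenerate
energy level given by

$$\begin{aligned}\psi^{(3)}_{2,2,1}(z_1,z_2,z_3) &= \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{2\pi z_1}{L}\sin\frac{2\pi z_2}{L}\sin\frac{\pi z_3}{L},\\ \psi^{(3)}_{2,1,2}(z_1,z_2,z_3) &= \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{2\pi z_1}{L}\sin\frac{\pi z_2}{L}\sin\frac{2\pi z_3}{L},\\ \psi^{(3)}_{1,2,2}(z_1,z_2,z_3) &= \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{\pi z_1}{L}\sin\frac{2\pi z_2}{L}\sin\frac{2\pi z_3}{L}\end{aligned} \tag{11.20}$$

$$E_{2,2,1} = E_1 + 2E_2 = \frac{9\pi^2\hbar^2}{2m_aL^2}. \tag{11.21}$$

(b) Electrons are indistinguishable fermions, so their many-particle states must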
be antisymmetric. The single-particle orbitals are spinors, formed as a tensor
product of the eigenvectors of the infinite potential well and of the spin operator
$\hat{S}_z$. For convenience, I will begin by writing down the single-particle orbitals
in the symbolic form $|n,m_s\rangle$, where n corresponds to an energy level in
the infinite well and $m_s$ is a spin magnetic number. To construct the vector
representing the ground state of the three-electron system, I need to include
three different orbitals with the lowest single-particle energies. Obviously these
are $|1,\uparrow\rangle$, $|1,\downarrow\rangle$, $|2,m_s\rangle$. The choice of the spin state in the third orbital is
arbitrary, so that there are two different ground states with the same energy. The
respective Slater determinant becomes

$$|1,1,2\rangle = \frac{1}{\sqrt{6}}\begin{vmatrix} |1,\uparrow\rangle_1 & |1,\uparrow\rangle_2 & |1,\uparrow\rangle_3 \\ |1,\downarrow\rangle_1 & |1,\downarrow\rangle_2 & |1,\downarrow\rangle_3 \\ |2,\uparrow\rangle_1 & |2,\uparrow\rangle_2 & |2,\uparrow\rangle_3 \end{vmatrix},$$

where the lower subindex enumerates electrons, and I chose for concreteness
the spin-up state for the spin portion of the third orbital. Notation $|1,1,2\rangle$ for
the three-electron state was chosen in the form, which reflects the eigenvectors
of the infinite potential, “occupied”² by electrons in this state. Expanding the
determinant and pulling out the spin number into a separate ket, I have

$$\begin{aligned}|1,1,2\rangle = \frac{1}{\sqrt{6}}\Big[&|1\rangle_1|\uparrow\rangle_1\,|1\rangle_2|\downarrow\rangle_2\,|2\rangle_3|\uparrow\rangle_3 - |1\rangle_1|\downarrow\rangle_1\,|1\rangle_2|\uparrow\rangle_2\,|2\rangle_3|\uparrow\rangle_3 \\ -\,&|1\rangle_1|\uparrow\rangle_1\,|2\rangle_2|\uparrow\rangle_2\,|1\rangle_3|\downarrow\rangle_3 + |2\rangle_1|\uparrow\rangle_1\,|1\rangle_2|\uparrow\rangle_2\,|1\rangle_3|\downarrow\rangle_3 \\ +\,&|1\rangle_1|\downarrow\rangle_1\,|2\rangle_2|\uparrow\rangle_2\,|1\rangle_3|\uparrow\rangle_3 - |2\rangle_1|\uparrow\rangle_1\,|1\rangle_2|\downarrow\rangle_2\,|1\rangle_3|\uparrow\rangle_3\Big].\end{aligned}$$


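² “Occupied” in this context means that a given orbital participates in the formation of a given
many-particle state.

Bringing back the position representation of the eigenvectors of the well, the
last result can be written down by replacing each product $|n\rangle_i|\uparrow\rangle_i$ with the spinor
$\sqrt{2/L}\,\left[\sin(n\pi z_i/L),\ 0\right]^{T}$ and each product $|n\rangle_i|\downarrow\rangle_i$ with the spinor
$\sqrt{2/L}\,\left[0,\ \sin(n\pi z_i/L)\right]^{T}$.

To get a bit more comfortable with this expression, let’s apply operator

$$\hat{H} = \hat{H}^{(1)} + \hat{H}^{(2)} + \hat{H}^{(3)},$$

where $\hat{H}^{(i)}$ is a single-electron infinite potential well Hamiltonian, which in the
spinor representation is proportional to a unit matrix. Acting on the first of the six
terms, for instance, it generates

$$\begin{aligned}\frac{1}{\sqrt{6}}\left(\frac{2}{L}\right)^{3/2}\Bigg\{&\hat{H}^{(1)}\begin{bmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{bmatrix}\begin{bmatrix}0\\ \sin\frac{\pi z_2}{L}\end{bmatrix}\begin{bmatrix}\sin\frac{2\pi z_3}{L}\\ 0\end{bmatrix} + \begin{bmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{bmatrix}\hat{H}^{(2)}\begin{bmatrix}0\\ \sin\frac{\pi z_2}{L}\end{bmatrix}\begin{bmatrix}\sin\frac{2\pi z_3}{L}\\ 0\end{bmatrix} \\ +\,&\begin{bmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{bmatrix}\begin{bmatrix}0\\ \sin\frac{\pi z_2}{L}\end{bmatrix}\hat{H}^{(3)}\begin{bmatrix}\sin\frac{2\pi z_3}{L}\\ 0\end{bmatrix}\Bigg\},\end{aligned}$$

and each of the remaining five terms is treated in the same way.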

I understand that this expression looks awfully intimidating (or just awful), but I
still want you to gather your wits and go through it line by line, and let the force
be with you. The first thing that you should notice is that every single-particle
Hamiltonian affects only those orbitals that contain its own particle. Now
remembering that each of the orbitals is an eigenvector of the corresponding
Hamiltonian, you can replace the action of each Hamiltonian by multiplication
by the corresponding eigenvalue, $E_1$ or $E_2$, where $E_{1,2}$ are the energy
eigenvalues corresponding to eigenvectors $|1\rangle$, $|2\rangle$
of the infinite potential well. Combining the like terms (terms with the same
combination of single-particle orbitals), you will find

$$\hat{H}|1,1,2\rangle = (2E_1 + E_2)\,|1,1,2\rangle.$$

The second eigenvector belonging to this eigenvalue can be generated by
changing the spin state paired with the orbital state $|2\rangle$ from spin-up to
spin-down, i.e., by replacing $|2,\uparrow\rangle$ with $|2,\downarrow\rangle$ in the third row of the
Slater determinant.

To get the next energy level and the corresponding eigenvector, I just need
to move one of the particles to the orbital $|2\rangle|m_s\rangle$, which means that the Slater
determinant is now formed by orbitals $|1,m_s\rangle$, $|2,\downarrow\rangle$, $|2,\uparrow\rangle$ with an arbitrary
value of the spin state in the single-particle ground state. Using for concreteness
the spin-up value in $|1,m_s\rangle$, I obtain the state $|2,2,1\rangle$ as the Slater determinant
of the orbitals $|1,\uparrow\rangle$, $|2,\uparrow\rangle$, and $|2,\downarrow\rangle$.
The energy corresponding to this state is $E_{2,2,1} = E_1 + 2E_2$ and coincides with
Eq. 11.21 for the energy of the second excited level in the system of the distinguishable
particles. Finally, to generate the next lowest energy level, one has to return two
electrons to the orbitals $|1,\uparrow\rangle$ and $|1,\downarrow\rangle$, and then the only choice for the
third orbital would be to use one of the two $|3\rangle|m_s\rangle$ orbitals, which results in
two degenerate eigenvectors. Each of them is obtained from the ground state
expression by simply replacing $\sin(2\pi z_i/L)$ everywhere with $\sin(3\pi z_i/L)$.
The respective energy value is given by

$$E_{1,1,3} = 2E_1 + E_3 = \frac{11\pi^2\hbar^2}{2m_eL^2}.$$

The configuration in which two electrons occupy the orbitals $|2,\uparrow\rangle$ and $|2,\downarrow\rangle$
and the third one is in $|3\rangle|m_s\rangle$ lies higher, at

$$E_{2,2,3} = E_3 + 2E_2 = \frac{17\pi^2\hbar^2}{2m_eL^2}.$$


(c) Now, let me deal with the system of three identical spinless bosons. The
symmetry requirement for the three-boson system allows using all identical
orbitals (the resulting state is automatically symmetric); thus, the ground state
can be built of a single orbital $|1\rangle$ and turns out to be the same as in the case of
distinguishable particles and with the same energy value (Eq. 11.17). A difference
from distinguishable particles arises when transitioning to excited states.
Now, to satisfy the symmetry requirements, I have to turn the three degenerate states
of Eqs. 11.18 and 11.20 with energies given by Eqs. 11.19 and 11.21 into single
non-degenerate states: for each of these energy levels, the only admissible state is
the symmetric sum of the three degenerate wave functions, normalized by the factor $1/\sqrt{3}$.

11.3 Pauli Principle and Periodic Table of Elements: Electronic Structure of Atoms

While we are not equipped to deal with systems of large numbers of interacting 
particles, you can still appreciate how Pauli’s exclusion principle helped to 
explain the periodicity in the properties of atoms. In order to follow the 
arguments, you need to keep in mind two important points. First, when discussing 
the chemical properties of atoms, people are interested foremost in the many-particle 
ground state, i.e., the state of many electrons that has the lowest possible 
energy. Second, since the Pauli principle forbids states in which two electrons 
occupy the same orbital, you have to build many-particle states using at least as 
many orbitals as there are particles in your system, starting with the ground state 
orbitals and adding new orbitals in a way that minimizes the unavoidable 
increase of the sum of single-particle energies of all involved electrons. This 
last point implicitly assumes that the lowest energy state of non-interacting particles 
would remain the lowest energy state even if the interaction is taken into account. This 
assumption is not always true, but the discussion of this issue is beyond the scope 
of this book. Anyway, having these two points in mind, let’s consider what happens 
with states of electrons as we move along the periodic table. Helium occupies 
the second place in the first row and is known as an inert gas, meaning that it is very 
stable and is not eager to participate in chemical reactions or form chemical bonds. 
It has two electrons, and therefore you need only two orbitals, which can have the 
same value of the principal number $n = 1$, to construct a two-electron state: 

$$|1,0,0,1/2\rangle_1\,|1,0,0,-1/2\rangle_2 - |1,0,0,1/2\rangle_2\,|1,0,0,-1/2\rangle_1.$$

These two orbitals exhaust all available states with the same principal number. In 
chemical language, we can say that the electrons in the helium atom belong to a complete 
or closed shell. Going to the next atom, lithium (Li), you will notice that it has 
very different chemical properties: lithium is an active alkali metal, which readily 
participates in a variety of chemical reactions and forms a number of different 
compounds, gladly offering one of its electrons for chemical bonding. Three lithium 
electrons need more than two orbitals to form a three-electron state, so you must 
start dealing with orbitals characterized by principal number $n = 2$. There are 
eight of them, but only one is really required to form the lowest energy three-electron 
state, and as a result seven of those orbitals remain, using physicist’s 
jargon, “unoccupied.” As you go along the second row of the periodic table, 
the number of electrons increases to four in the case of beryllium, five for boron, 
six for carbon, seven for nitrogen, eight for oxygen, nine for fluorine, and finally 
ten for neon. With an increasing number of electrons, you must add additional 
orbitals to be able to create corresponding many-electron states, so that the number 
of “unoccupied” orbitals decreases. As the number of available, unused orbitals 
gets smaller, the chemical activity of the corresponding substances diminishes, 
until you reach another inert gas, neon. To construct a many-electron state for neon 
with ten electrons, you have to use all ten available orbitals with $n = 1$ and $n = 2$. 
Consequently, the electron structure of neon is again characterized as a closed shell 
configuration. A popular way to visualize this process of filling up the available 
orbitals consists in assigning numbers $1, 2, \ldots$ to the principal quantum number $n$ 
and letters $s$, $p$, $d$, and $f$ to orbitals with orbital angular momentum number $l$ equal 
to 0, 1, 2, and 3, respectively. The configuration of helium in this notation, primarily 
used in atomic physics and quantum chemistry, would be $1s^2$, where the first number 
stands for the principal number, and the upper index indicates the number of electrons 
available for assignment to orbitals with $l = 0$. The electronic structure of elements 
in the second row of the periodic table discussed above is shown in Table 11.1. 

Table 11.1 Elements of the second row of the periodic table and electronic configurations of their ground states in terms of single-electron orbitals and the term symbols 

Element              Configuration           Term symbol 
$\mathrm{Li}_3$      $1s^2 2s^1$             $^2S_{1/2}$ 
$\mathrm{Be}_4$      $1s^2 2s^2$             $^1S_0$ 
$\mathrm{B}_5$       $1s^2 2s^2 2p^1$        $^2P_{1/2}$ 
$\mathrm{C}_6$       $1s^2 2s^2 2p^2$        $^3P_0$ 
$\mathrm{N}_7$       $1s^2 2s^2 2p^3$        $^4S_{3/2}$ 
$\mathrm{O}_8$       $1s^2 2s^2 2p^4$        $^3P_2$ 
$\mathrm{F}_9$       $1s^2 2s^2 2p^5$        $^2P_{3/2}$ 
$\mathrm{Ne}_{10}$   $1s^2 2s^2 2p^6$        $^1S_0$ 

You can see from this table that the $l = 0$ orbitals are added first to the list 
of available single-electron states, and only after that are the additional six orbitals with 
$l = 1$ and different values of $m$ and $m_s$ thrown in. The supposition here is 
that single-electron states with $l = 0$ would contribute less energy than the $l = 1$ 
states.³ Therefore, these two orbitals must be incorporated into the basis first. The 
assumption that the orbitals with larger $n$ and larger $l$ would contribute more energy, 
and, therefore, the corresponding orbitals must be added only after the orbitals 
with lower values of these numbers are filled, is not always correct, and for some 
elements orbitals with lower $l$ and higher $n$ contribute less energy than orbitals with 
higher $l$ and lower $n$. This happens, for instance, with the orbital $4s$, which contributes 
less energy than the orbital $3d$, but there are no simple hand-waving arguments that 
could explain or predict this behavior. Anyway, going now to the third row of the 
periodic table, you again start with a new set of orbitals characterized by $n = 3$, 
plenty of which are available for the eleventh electron of the first element of this row, another alkali 
metal, sodium. I think you get the gist of how it works, but on the other hand, 
you should be aware that this line of argument is still a gross oversimplification: the 
periodic table of elements is not that periodic in some instances, and there are lots 
of elements that do not fit this simple model of closed shells. 
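
The filling procedure described above for the first two rows can be sketched in a few lines of code. This is only an illustration of the simple closed-shell counting, assuming the fixed order $1s$, $2s$, $2p$ and a capacity of $2(2l+1)$ orbitals per subshell; the function names are hypothetical, and the sketch deliberately stops at neon because the simple ordering is not reliable for heavier elements.

```python
# Subshells in the simple filling order used for the first two rows.
# Each entry is (label, orbital number l).
SUBSHELLS = [("1s", 0), ("2s", 0), ("2p", 1)]


def capacity(l):
    """Number of single-electron orbitals in a subshell: (2l + 1) values of m times 2 spin values."""
    return 2 * (2 * l + 1)


def configuration(num_electrons):
    """Fill the subshells in order and return a string such as '1s2 2s2 2p1'."""
    total_capacity = sum(capacity(l) for _, l in SUBSHELLS)
    if not 0 < num_electrons <= total_capacity:
        raise ValueError("this sketch only covers atoms from hydrogen to neon")
    parts = []
    remaining = num_electrons
    for label, l in SUBSHELLS:
        if remaining == 0:
            break
        occupied = min(remaining, capacity(l))
        parts.append(f"{label}{occupied}")
        remaining -= occupied
    return " ".join(parts)


if __name__ == "__main__":
    second_row = ["Li", "Be", "B", "C", "N", "O", "F", "Ne"]
    for z, name in enumerate(second_row, start=3):
        print(f"{name:2s} (Z = {z:2d}): {configuration(z)}")
```

The number printed after each subshell label is the number of electrons assigned to it, so the output can be compared line by line with the configuration column of Table 11.1.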

Single-electron orbitals $|n, l, m, m_s\rangle$ based on eigenvectors of operators of orbital 
and spin angular momenta are not the only way to characterize the ground states 
of atoms. An alternative approach is based on using eigenvectors of the total orbital 
angular momentum $\hat{\mathbf{L}} = \sum_i \hat{\mathbf{L}}^{(i)}$ (sum of the orbital momenta of all electrons), 
the total spin of all electrons $\hat{\mathbf{S}}^{(tot)} = \sum_i \hat{\mathbf{S}}^{(i)}$, and the grand total momentum $\hat{\mathbf{J}} = \hat{\mathbf{L}} + \hat{\mathbf{S}}^{(tot)}$. 

³ I have to remind you that while the hydrogen energy levels are degenerate with respect to $l$, for other atoms this is not true because of the interaction with other electrons. 


Properties of the sum of two arbitrary angular momentum operators, $\hat{\mathbf{J}}^{(1)}$ and $\hat{\mathbf{J}}^{(2)}$, 
can be figured out by generalizing the results for the sum of two spins or the spin 1/2 
and the angular momentum presented in Chap. 9. The eigenvectors of the operator 
$\left(\hat{\mathbf{J}}^{(1)} + \hat{\mathbf{J}}^{(2)}\right)^2$ are characterized by quantum number $j$, which can take values 

$$|j_1 - j_2| \le j \le j_1 + j_2, \tag{11.23}$$

where $j_1$ and $j_2$ refer to eigenvalues of $\left(\hat{\mathbf{J}}^{(1)}\right)^2$ and $\left(\hat{\mathbf{J}}^{(2)}\right)^2$, respectively. For each $j$, 
eigenvalues of $\hat{J}_z^{(1)} + \hat{J}_z^{(2)}$ are characterized by magnetic numbers $M_j$ obeying the usual 
inequality $|M_j| \le j$ and related to individual magnetic numbers $m_{j_1}$ and $m_{j_2}$ of $\hat{J}_z^{(1)}$ 
and $\hat{J}_z^{(2)}$, correspondingly, as 

$$M_j = m_{j_1} + m_{j_2}. \tag{11.24}$$

While Eq. 11.24 can be easily derived, proving Eq. 11.23 is a bit more than you can 
chew at this stage, but you may at least verify that it agrees with the cases considered 
in Chap. 9: for two 1/2 spins, Eq. 11.23 gives two values for $j$, $j = 1, 0$, in agreement 
with Eqs. 9.54 and 9.55, and for the sum of the orbital momentum and the 1/2 spin, 
Eq. 11.23 yields $j = l \pm 1/2$, again in agreement with Sect. 9.5.2. 
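
A quick way to convince yourself that Eq. 11.23 is at least plausible is to count states. The sketch below (with hypothetical helper names) lists the values of $j$ allowed by Eq. 11.23, assuming that $j$ changes in unit steps, and checks that the total number of vectors $\sum_j (2j+1)$ coincides with the dimension $(2j_1+1)(2j_2+1)$ of the space spanned by products of the individual eigenvectors. This is a consistency check, not a proof.

```python
from fractions import Fraction


def allowed_j(j1, j2):
    """Values of j permitted by Eq. 11.23, from |j1 - j2| to j1 + j2 in unit steps."""
    j1, j2 = Fraction(j1), Fraction(j2)
    j = abs(j1 - j2)
    values = []
    while j <= j1 + j2:
        values.append(j)
        j += 1
    return values


def multiplicity(j):
    """Number of magnetic numbers M_j satisfying |M_j| <= j."""
    return int(2 * Fraction(j) + 1)


def dimensions_match(j1, j2):
    """Compare the product-basis dimension with the total-momentum-basis dimension."""
    product_dim = multiplicity(j1) * multiplicity(j2)
    coupled_dim = sum(multiplicity(j) for j in allowed_j(j1, j2))
    return product_dim == coupled_dim


if __name__ == "__main__":
    half = Fraction(1, 2)
    print("two spins 1/2:", [str(j) for j in allowed_j(half, half)])
    print("l = 1 and spin 1/2:", [str(j) for j in allowed_j(1, half)])
    print("dimensions agree for l1 = l2 = 1:", dimensions_match(1, 1))
```

Exact fractions are used instead of floating-point numbers so that half-integer momenta are compared without rounding issues.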

The transition from the description of many-fermion states in terms of single-particle 
orbitals to the basis formed by eigenvectors of total orbital momentum, 
total spin, and grand total angular momentum raises an important issue of separate 
symmetry properties of many-particle orbital and spin states. Consider again for 
simplicity two fermions that can individually be in orbital states $|\psi_1\rangle$ and $|\psi_2\rangle$ and 
spin states $|\uparrow\rangle$ and $|\downarrow\rangle$. In the description where spin and orbital states are lumped 
together in a single-particle orbital (this is what I did writing equations such as 
Eq. 11.11 or 11.13), I would have introduced four single-electron orbitals $|\alpha_i\rangle$: 

$$|\alpha_1\rangle = |\psi_1, \uparrow\rangle;\quad |\alpha_2\rangle = |\psi_2, \uparrow\rangle;\quad |\alpha_3\rangle = |\psi_1, \downarrow\rangle;\quad |\alpha_4\rangle = |\psi_2, \downarrow\rangle$$

and used them as a basis in a $4!/(2!\,2!) = 6$-dimensional two-fermion space. If, 
however, I preferred to use eigenvectors of the total spin of the two particles as 
a basis in the spin sector of the total spin-orbital two-particle space, separating 
thereby the orbital and spin states, I would have to make sure that both the former 
and the latter components separately possess a definite parity with respect to the 
exchange of the particles. Four eigenvectors of 
the total spin of two spin 1/2 particles, indeed, contain a symmetric triplet $|1, M_S\rangle$ 
of states with total spin $S^{(tot)} = 1$ (see Eq. 9.54) and one antisymmetric singlet state 
(Eq. 9.55) with total spin $S^{(tot)} = 0$. Thus, if I take these states as the spin components 
of the total basis of two-particle fermion states, then the symmetry of the spin 
component will dictate the symmetry of the orbital portion. Indeed, to make the 
entire two-fermion state antisymmetric, the orbital component paired with any of 
the symmetric two-spin states $|1, M_S\rangle$ must itself be antisymmetric. Two available 
orbital states can only yield a single antisymmetric combination, resulting in three 
basis vectors characterized by the value of total spin $S^{(tot)} = 1$: 

$$\begin{aligned}
&\frac{1}{\sqrt{2}}\left[|\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle - |\psi_2^{(1)}\rangle|\psi_1^{(2)}\rangle\right]|1,-1\rangle \\
&\frac{1}{\sqrt{2}}\left[|\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle - |\psi_2^{(1)}\rangle|\psi_1^{(2)}\rangle\right]|1,0\rangle \\
&\frac{1}{\sqrt{2}}\left[|\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle - |\psi_2^{(1)}\rangle|\psi_1^{(2)}\rangle\right]|1,1\rangle,
\end{aligned} \tag{11.25}$$


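where the $1/\sqrt{2}$ factor ensures the normalization of the vector representing the orbital 
portion of the state. The remaining total spin eigenvector corresponding to $S = 0$ is 
an antisymmetric singlet $|0,0\rangle$. Consequently, the corresponding orbital part of the 
two-particle state must be symmetric, resulting in three additional possible states: 

$$\begin{aligned}
&|\psi_1^{(1)}\rangle|\psi_1^{(2)}\rangle|0,0\rangle \\
&|\psi_2^{(1)}\rangle|\psi_2^{(2)}\rangle|0,0\rangle \\
&\frac{1}{\sqrt{2}}\left[|\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle + |\psi_2^{(1)}\rangle|\psi_1^{(2)}\rangle\right]|0,0\rangle.
\end{aligned} \tag{11.26}$$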


You may notice that the first two of these states are formed by identical orbitals. 
This is not forbidden by the Pauli principle because the spin state of two electrons 
in this case is antisymmetric. This situation is often described by saying that the 
two electrons in the same orbital state have “opposite” spins, which is not exactly 
accurate. Indeed, “opposite” can refer only to the possible values of the $z$-component 
of spin, but those can have opposite values in the singlet state as well as in the triplet 
state with $M_S = 0$. Thus, it is more accurate to describe this situation as a total spin 
zero or a singlet state. Combining three spin-antisymmetric states, Eq. 11.26, with 
three antisymmetric-orbital states, Eq. 11.25, you find that the total number of basis 
vectors in this representation is the same (six) as in the single-particle orbital basis, 
confirming that this is just an alternative basis in the same vector space. 
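
The counting argument above is easy to automate. The following sketch (function names are hypothetical) counts two-fermion basis vectors in both descriptions for an arbitrary number of available orbital states: antisymmetric pairs of spin-orbitals on one side, and antisymmetric-orbital states paired with the triplet plus symmetric-orbital states paired with the singlet on the other. It only counts states and does not construct the vectors themselves.

```python
from itertools import combinations, combinations_with_replacement


def count_orbital_basis(n_orbital):
    """Antisymmetric pairs built from the 2 * n_orbital spin-orbitals |psi_i, up/down>."""
    spin_orbitals = [(i, s) for i in range(n_orbital) for s in ("up", "down")]
    return len(list(combinations(spin_orbitals, 2)))


def count_separated_basis(n_orbital):
    """States of the form (orbital part) x (total-spin part) that are antisymmetric overall."""
    orbitals = range(n_orbital)
    # An antisymmetric orbital combination needs two different orbital states.
    antisymmetric_orbital = len(list(combinations(orbitals, 2)))
    # A symmetric orbital combination may also place both particles in the same state.
    symmetric_orbital = len(list(combinations_with_replacement(orbitals, 2)))
    triplet_states = 3  # |1, M_S> with M_S = -1, 0, 1 (symmetric spin part)
    singlet_states = 1  # |0, 0> (antisymmetric spin part)
    return antisymmetric_orbital * triplet_states + symmetric_orbital * singlet_states


if __name__ == "__main__":
    for n in (2, 3):
        print(n, "orbital states:", count_orbital_basis(n), count_separated_basis(n))
```

For two orbital states both counts give six, in agreement with Eqs. 11.25 and 11.26; the same bookkeeping with three orbital states gives fifteen.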

A more realistic example of a basis based on the separation of many-fermion spin 
and orbital states would include two particles and at least three single-particle orbital 
states corresponding to $l = 1$, $m = -1, 0, 1$. The total orbital angular momentum of 
two electrons in this case can take three values, $L = 0, 1, 2$, with the total number of 
corresponding states being $1 + 3 + 5 = 9$ with various values of magnetic number 
$M$. To figure out the symmetry of these states, you would need to present them as a 
linear combination of single-particle states using the Clebsch-Gordan coefficients, 
similar to what I did in Sect. 9.5.2: 

$$|L, l_1, l_2, M\rangle = \sum_{m_1, m_2} C^{L,l_1,l_2}_{M,m_1,m_2}\,|l_1, m_1\rangle|l_2, m_2\rangle\,\delta_{m_2, M-m_1}, \tag{11.27}$$

where Kronecker’s delta makes sure that Eq. 11.24 is respected. The particle 
exchange symmetry of the states presented by $|L, l_1, l_2, M\rangle$ is determined by 
the transformation rule of the Clebsch-Gordan coefficients with respect to the 
transposition of indexes $l_1, m_1$ and $l_2, m_2$, which you will have to accept without 
proof: 

$$C^{L,l_1,l_2}_{M,m_1,m_2} = (-1)^{L-l_1-l_2}\,C^{L,l_2,l_1}_{M,m_2,m_1}. \tag{11.28}$$

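Indeed, applying the exchange operator $\hat{P}(1,2)$ to Eq. 11.27, you will see that its 
action on the right-hand side of the equation consists in the interchange of indexes 
$l_1$ and $l_2$ in the Clebsch-Gordan coefficients: 

$$\begin{aligned}
\hat{P}(1,2)\,|L, l_1, l_2, M\rangle &= \sum_{m_1, m_2} C^{L,l_2,l_1}_{M,m_2,m_1}\,|l_1, m_1\rangle|l_2, m_2\rangle\,\delta_{m_1, M-m_2} \\
&= (-1)^{L-l_1-l_2}\sum_{m_1, m_2} C^{L,l_1,l_2}_{M,m_1,m_2}\,|l_1, m_1\rangle|l_2, m_2\rangle\,\delta_{m_2, M-m_1} \\
&= (-1)^{L-l_1-l_2}\,|L, l_1, l_2, M\rangle.
\end{aligned}$$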

In the second line of this expression, I used the transposition property of 
Eq. 11.28. With this it becomes quite evident that the state $|L, l_1, l_2, M\rangle$ is symmetric 
with respect to the exchange of particles if $L - l_1 - l_2$ is even and is antisymmetric 
if $L - l_1 - l_2$ is odd. In the example when $l_1 = l_2 = 1$, which I am trying to figure 
out now, this rule yields that the states with $L = 2$ and $L = 0$ are symmetric and 
the state with $L = 1$ is antisymmetric. Correspondingly, the latter must be paired 
with a triplet spin state, while the former two must go together with the zero spin 
state. Since the total number of single-electron orbitals in this case is 6, the expected 
number of two-particle antisymmetric basis vectors is $6!/(4!\,2!) = 15$, and if you 
insist I can list all of them below (I will use a simplified notation $|L, M\rangle$ omitting $l_1$ 
and $l_2$): 

$$\begin{aligned}
&|2, M\rangle|0,0\rangle \\
&|1, M\rangle|1, M_S\rangle \\
&|0,0\rangle|0,0\rangle.
\end{aligned} \tag{11.29}$$

The first line in this expression contains five vectors with $-2 \le M \le 2$, the second 
line represents $3 \times 3 = 9$ vectors with both $M$ and $M_S$ taking three values each, and 
finally, the last line supplies the last, 15th vector to the basis. 
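
Since the transposition rule for the Clebsch-Gordan coefficients is taken here without proof, a numerical check may be reassuring. The sketch below evaluates the coefficients for integer momenta from Racah’s closed-form sum (a standard textbook formula, not something derived in this chapter) and verifies the sign $(-1)^{L - l_1 - l_2}$ for every pair $m_1, m_2$. The function names and the argument order are my own choices for this illustration.

```python
from math import factorial, isclose, sqrt


def clebsch_gordan(L, M, l1, m1, l2, m2):
    """C^{L,l1,l2}_{M,m1,m2} for integer momenta, computed from Racah's closed-form sum."""
    if m1 + m2 != M or not abs(l1 - l2) <= L <= l1 + l2:
        return 0.0
    if abs(m1) > l1 or abs(m2) > l2 or abs(M) > L:
        return 0.0
    f = factorial
    prefactor = sqrt(
        (2 * L + 1) * f(L + l1 - l2) * f(L - l1 + l2) * f(l1 + l2 - L) / f(l1 + l2 + L + 1)
    )
    prefactor *= sqrt(
        f(L + M) * f(L - M) * f(l1 - m1) * f(l1 + m1) * f(l2 - m2) * f(l2 + m2)
    )
    total = 0.0
    for k in range(0, l1 + l2 - L + 1):
        arguments = [
            k,
            l1 + l2 - L - k,
            l1 - m1 - k,
            l2 + m2 - k,
            L - l2 + m1 + k,
            L - l1 - m2 + k,
        ]
        if min(arguments) < 0:
            continue  # a negative factorial argument means the term is absent from the sum
        term = 1.0
        for a in arguments:
            term /= f(a)
        total += (-1) ** k * term
    return prefactor * total


def exchange_sign(L, l1, l2):
    """Sign (-1)^(L - l1 - l2) predicted by the transposition rule."""
    return 1 if (L - l1 - l2) % 2 == 0 else -1


def transposition_rule_holds(L, l1, l2):
    """Check C^{L,l1,l2}_{M,m1,m2} == sign * C^{L,l2,l1}_{M,m2,m1} for all m1, m2."""
    sign = exchange_sign(L, l1, l2)
    for m1 in range(-l1, l1 + 1):
        for m2 in range(-l2, l2 + 1):
            direct = clebsch_gordan(L, m1 + m2, l1, m1, l2, m2)
            swapped = clebsch_gordan(L, m1 + m2, l2, m2, l1, m1)
            if not isclose(direct, sign * swapped, abs_tol=1e-12):
                return False
    return True


if __name__ == "__main__":
    for L in (0, 1, 2):
        kind = "symmetric" if exchange_sign(L, 1, 1) == 1 else "antisymmetric"
        print(f"L = {L}: rule holds = {transposition_rule_holds(L, 1, 1)}, state is {kind}")
```

For $l_1 = l_2 = 1$ the sign is positive for $L = 0, 2$ and negative for $L = 1$, which is the symmetry assignment used in the text.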
Finally, to complete the picture, I can rewrite these vectors in terms of the grand 
total momentum $J$. The first five vectors from the expression above obviously 
correspond to $j = 2$, so one can easily replace this line with vectors $|2, M_J\rangle$, 
where the first number now corresponds to the value of $j$. The nine vectors from the 
second line correspond to three values of $j$: $j = 2, 1, 0$. While this situation appears 
terribly similar to the case of $l_1 = 1$ and $l_2 = 1$ states considered previously, the 
significant difference is that vectors $|1, M\rangle|1, M_S\rangle$ are no longer associated with just 
one or another particle, so that Eq. 11.28 has no relation to the symmetry properties 
of the resulting states $|j, 1, 1, M_J\rangle$ with respect to the exchange of the particles. 
All these states are as antisymmetric under operator $\hat{P}(1,2)$ as states $|1, M\rangle|1, M_S\rangle$. 
The last line in Eq. 11.29 obviously corresponds to a single state with zero grand 
total angular momentum, which simply coincides with $|0,0\rangle|0,0\rangle$. In summary, the 
antisymmetric basis in terms of eigenvectors of operators $\hat{J}^2$, $\hat{L}^2$, $\left(\hat{S}^{(tot)}\right)^2$, and 
$\hat{J}_z$ is formed by vectors $|j, L, S, M_J\rangle$: 

$$|2,2,0,M_J\rangle,\quad |2,1,1,M_J\rangle,\quad |1,1,1,M_J\rangle,\quad |0,1,1,0\rangle,\quad |0,0,0,0\rangle. \tag{11.30}$$

It is easy to check that this basis also consists of $5 + 5 + 3 + 1 + 1 = 15$ vectors. They 
can be expressed as linear combinations of eigenvectors of the total orbital and total 
spin momenta (Eq. 11.29) with the help of Eq. 11.27 and the same Clebsch-Gordan 
coefficients, which can always be found on the Internet. Just to illustrate this point, 
let me do it for the grand total eigenvector $|2,1,1,0\rangle$ using one of the tables of the 
Clebsch-Gordan coefficients that Google dug out for me in the depth of the World 
Wide Web: 

$$|2,1,1,0\rangle = \sqrt{\frac{1}{6}}\,|1,1\rangle|1,-1\rangle + \sqrt{\frac{1}{6}}\,|1,-1\rangle|1,1\rangle + \sqrt{\frac{2}{3}}\,|1,0\rangle|1,0\rangle.$$

Values of the total orbital, spin, and grand total momentum are often used to 
designate the electronic structure of atoms, instead of single-electron orbitals, in 
the form of the so-called term symbol: 

$$^{2S+1}L_J. \tag{11.31}$$

Here the center symbol designates the value of the total orbital momentum using the 
same correspondence between numerical values and letters, $S, P, D, F$ for $0, 1, 2, 3$, 
as in the single-electron orbital case but with capital rather 
than lowercase letters. The right subscript shows the value of the grand total 
momentum, and the left superscript shows the multiplicity of the respective energy 
configuration with respect to the total spin magnetic number $M_S$. For instance, 
using this notation, the states $|1,1,1,M_J\rangle$ can be described as $^3P_1$, while states 
$|2,2,0,M_J\rangle$ become $^1D_2$. 
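
The rule for composing a term symbol is mechanical enough to be written as a small function. The sketch below is only an illustration with a hypothetical function name; it prints the symbol in plain text (for example `3P1` for $^3P_1$), covers only the letters $S, P, D, F$ mentioned above, and rejects values of $J$ that are incompatible with Eq. 11.23.

```python
from fractions import Fraction

LETTERS = "SPDF"  # capital letters for L = 0, 1, 2, 3


def term_symbol(L, S, J):
    """Return a plain-text term symbol such as '3P1' or '2P1/2' for given L, S, and J."""
    S, J = Fraction(S), Fraction(J)
    if not 0 <= L < len(LETTERS):
        raise ValueError("this sketch only knows the letters S, P, D, F")
    lowest = abs(L - S)
    if not lowest <= J <= L + S or (J - lowest).denominator != 1:
        raise ValueError("J must lie between |L - S| and L + S in unit steps")
    multiplicity = 2 * S + 1
    return f"{multiplicity}{LETTERS[L]}{J}"


if __name__ == "__main__":
    print(term_symbol(1, 1, 1))                              # states |1,1,1,M_J>
    print(term_symbol(2, 0, 2))                              # states |2,2,0,M_J>
    print(term_symbol(0, Fraction(1, 2), Fraction(1, 2)))    # a single s-electron
```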

The example of two electrons and three available single-particle orbital states 
is more realistic than the one with only two such states, but it is still a far cry 
from what people have to deal with when analyzing real atoms. The system of only 
two electrons corresponds to the helium atom, and one needs only one orbital state 
with $l_{1,2} = 0$ to construct an antisymmetric two-electron ground state. In terms 
of eigenvectors of total angular momentum and total spin, this state corresponds to 
$L = 0$, $S = 0$: $|0,0\rangle|0,0\rangle$, where the orbital component is symmetric (both electrons 
are in the same orbital state), and the spin component is antisymmetric (spins are 
in an antisymmetric singlet state). The term symbol for this state is obviously $^1S_0$. 
Going from helium to lithium, you already have to deal with three electrons, with the 
corresponding structure in terms of single-electron orbitals shown in the first line of 
Table 11.1. To figure out the values of the total orbital, spin, and grand total momenta 
for this element, you can start with the one established for the helium atom and add an 
additional electron to it, assuming that it does not disturb the existing configuration 
of the two electrons in the closed shell. Since we know that this electron goes to 
the orbital with $l = 0$, the total orbital momentum remains zero, and the total spin 
becomes 1/2 (you add a single spin to a state with $S = 0$, so what else can you 
get?), so the grand total momentum becomes $J = 0 + 1/2 = 1/2$, and the term 
symbol for Li becomes the same as for hydrogen, $^2S_{1/2}$, emphasizing the periodic 
character of the electronic properties of the elements. For the same reason, the term 
symbol for the next element, beryllium, is exactly the same as the one we derived 
for helium (see Table 11.1). To figure out the term symbol for boron, ignore the two 
electrons in the first closed shell, which do not contribute anything to the total orbital 
or spin momenta, and focus on the three electrons in the second shell. For these three 
electrons, you have available two orbitals with the same orbital state, $l_1 = l_2 = 0$, 
and opposite spin states and an extra orbital with $l_3 = 1$ and $s_3 = 1/2$. The total 
orbital and spin momenta in this case can only be equal to $L = 1$ and $S = 1/2$, 
while the grand total momentum can be either $J_1 = 1/2$ or $J_2 = 3/2$. Thus, boron 
can be in one of two configurations, $^2P_{1/2}$ or $^2P_{3/2}$, but so far we have no means 
of figuring out which of these two configurations has a lower energy. To answer 
this question, we can ask for help from the German physicist Friedrich Hermann Hund, 
who formulated a set of empirical rules determining which term symbol describes the 
electron configuration with the lowest energy in an atom. These rules can be formulated 
as follows: 

1. For a given configuration, a term with the largest total spin has the lowest energy. 

2. Among the terms with the same multiplicity, a term with the largest total orbital 
momentum has the lowest energy. 

3. For the terms with the same total spin and total orbital momentum, the value of 
the grand total momentum corresponding to the lowest energy is determined by 
the filling of the outermost shell. If the outermost shell is half-filled or less than 
half-filled, then the term with the lowest value of the grand total momentum has 
the lowest energy, but if the outermost shell is more than half-filled, the term with 
the largest value of the grand total momentum has the lowest energy. 

In the case of boron, you have to go straight to the third of Hund’s rules because 
the first two do not disambiguate between the corresponding terms. Checking 
Table 11.1, you can see that the outermost shell for boron is the one characterized by 
principal number $n = 2$, and the total number of single-particle orbitals on this shell 
is 8. Since boron has three electrons on this shell, the shell is less than half-filled, 
so that the third Hund’s rule tells you that the ground state configuration of boron 
is $^2P_{1/2}$. 

The case of carbon is even more interesting. Ignoring again the paired $s$-electrons, which have 
$L = 0$ and $S = 0$, I focus on two $p$-electrons with $l_1 = l_2 = 1$. Speaking of total 
orbital momentum and total spin, you can identify the following possible values for 
$L$ and $S$: $L = 0, 1, 2$ and $S = 0, 1$. However, one needs to remember that the overall 
state, including its spin and orbital components, must be antisymmetric, so that not 
all combinations of $L$ and $S$ are possible. For instance, you already know that $L = 2$ 
orbital states are all symmetric; therefore, they can only coexist with the spin singlet $S = 0$. 
The corresponding grand total momentum is $J = 2$, so that the respective term is 
$^1D_2$. The state with total orbital momentum $L = 1$ is antisymmetric and, therefore, 
demands a symmetric triplet spin state $S = 1$. This combination of orbital and 
spin momenta can generate grand total momentum $J = 2, 1, 0$, so that we have the 
following terms: $^3P_2$, $^3P_1$, $^3P_0$. Finally, the symmetric $L = 0$ state must be coupled with 
the spin singlet, giving rise to the term $^1S_0$. In summary, I identified five possible terms 
consistent with the antisymmetry requirement: $^1D_2$, $^3P_2$, $^3P_1$, $^3P_0$, and $^1S_0$. Using the 
first two Hund’s rules, you can limit the choice of the ground state configuration to 
the $^3P$ states, and since the number of electrons in the outer shell of the C atom is only 4, 
the shell is half-filled, and the third Hund’s rule yields that the ground state configuration 
for carbon is $^3P_0$. Figuring out term symbols for elements where there are more than 
two electrons in the incomplete subshell (orbitals with the same value of the single-particle 
orbital momentum), such as nitrogen (three electrons on the $p$-subshell), is 
more complex, so I give you the term symbols for the rest of the elements in the 
second row of the periodic table in Table 11.1 without proof for you to contemplate. 
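
Hund’s rules amount to a three-step selection, which the sketch below applies to a list of candidate terms. It is an illustration only: the function name and the representation of a term as a tuple $(L, S, J)$ are hypothetical, the list of allowed terms must be supplied by hand (the code does not check the antisymmetry requirement), and the filling criterion is implemented exactly as worded in the third rule above.

```python
from fractions import Fraction


def ground_term(terms, electrons_in_shell, shell_capacity):
    """Pick the lowest-energy term (L, S, J) from `terms` using Hund's three rules.

    `electrons_in_shell` and `shell_capacity` describe the outermost shell and
    decide, via the third rule, whether the smallest or the largest J wins.
    """
    terms = [(L, Fraction(S), Fraction(J)) for L, S, J in terms]
    # Rule 1: the largest total spin.
    max_spin = max(S for _, S, _ in terms)
    candidates = [t for t in terms if t[1] == max_spin]
    # Rule 2: among those, the largest total orbital momentum.
    max_orbital = max(L for L, _, _ in candidates)
    candidates = [t for t in candidates if t[0] == max_orbital]
    # Rule 3: smallest J if the shell is half-filled or less, largest J otherwise.
    at_most_half_filled = 2 * electrons_in_shell <= shell_capacity
    pick = min if at_most_half_filled else max
    return pick(candidates, key=lambda t: t[2])


if __name__ == "__main__":
    half = Fraction(1, 2)
    boron = [(1, half, half), (1, half, 3 * half)]                     # 2P1/2, 2P3/2
    carbon = [(2, 0, 2), (1, 1, 2), (1, 1, 1), (1, 1, 0), (0, 0, 0)]  # 1D2, 3P2, 3P1, 3P0, 1S0
    print("boron :", ground_term(boron, electrons_in_shell=3, shell_capacity=8))
    print("carbon:", ground_term(carbon, electrons_in_shell=4, shell_capacity=8))
```

The two calls correspond to the boron and carbon examples discussed in the text, with three and four electrons in the $n = 2$ shell of eight orbitals.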


11.4 Exchange Energy and Other Exchange Effects 
11.4.1 Exchange Interaction 

Some of the examples discussed in the previous section have already demonstrated 
a weird interconnectedness between spin and orbital components of many-particle 
states, which has nothing to do with any kind of real spin-orbital interaction. 
Recall, for instance, Eqs. 11.25 and 11.26 for two-fermion states: the triplet spin 
state in Eq. 11.25 requires an antisymmetric orbital state, while the singlet spin state in 
Eq. 11.26 asks for the orbital states to be symmetric. In the absence of interaction 
between electrons, all three $S = 1$ states are degenerate and belong to the energy 
eigenvalue $E_1 + E_2$, where $E_{1,2}$ are eigenvalues of the single-particle Hamiltonian 
corresponding to the “occupied” orbital states. At the same time, $S = 0$ states 
correspond to three different energies, $2E_1$, $2E_2$, and $E_1 + E_2$, depending upon the 
orbital components used in their construction. The $E_1 + E_2$ energy level is, therefore, 
fourfold degenerate, with the corresponding eigenvectors formed by symmetric and 
antisymmetric combinations of the same two orbital functions $|\psi_1\rangle$ and $|\psi_2\rangle$. It 
is important to emphasize that three of these degenerate states correspond to the 
total spin of the system $S = 1$, and the fourth one possesses total spin $S = 0$. 
An interaction between electrons, however, might lift the degeneracy, making the 
energy of a two-electron system dependent on its spin state even in the absence of 
any actual spin-dependent interactions. This is yet another fascinating piece of evidence of 
the weirdness of the quantum world. 

I will demonstrate this phenomenon using a simple spin-independent Coulomb 
interaction potential: 

$$V(1,2) = \frac{e^2}{4\pi\varepsilon_0|\mathbf{r}_1 - \mathbf{r}_2|},$$

which describes the repulsion between two electrons in the atom of helium and is 
added to an attractive potential responsible for the interaction between the electrons 
and the nucleus. While a mathematically rigorous solution of a quantum three-body 
problem is too complicated for us to handle, what I can do is to compute the 
expectation value of the potential $V(1,2)$ using eigenvectors of the non-interacting 
electrons. As you will find out later in Chap. 13, such an expectation value gives 
you an approximation for the interaction-induced correction to the eigenvalues of 
the Hamiltonian. 

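Let me begin with the two-fermion state described by the vector presented 
in Eq. 11.25, which is characterized by an antisymmetric orbital component. The 
interaction potential does not contain any spin-related operators, allowing me to 
ignore the spin component of this state (it will simply yield $\langle 1, M_S|1, M_S\rangle = 1$) and 
write the expectation value as follows: 

$$\begin{aligned}
&\langle V(1,2)\rangle = \\
&\frac{1}{2}\left[\langle\psi_2^{(2)}|\langle\psi_1^{(1)}| - \langle\psi_1^{(2)}|\langle\psi_2^{(1)}|\right]V(1,2)\left[|\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle - |\psi_2^{(1)}\rangle|\psi_1^{(2)}\rangle\right] = \\
&\frac{1}{2}\left[\langle\psi_2^{(2)}|\langle\psi_1^{(1)}|V(1,2)|\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle + \langle\psi_1^{(2)}|\langle\psi_2^{(1)}|V(1,2)|\psi_2^{(1)}\rangle|\psi_1^{(2)}\rangle\right] - \qquad (11.32) \\
&\frac{1}{2}\left[\langle\psi_2^{(2)}|\langle\psi_1^{(1)}|V(1,2)|\psi_2^{(1)}\rangle|\psi_1^{(2)}\rangle + \langle\psi_1^{(2)}|\langle\psi_2^{(1)}|V(1,2)|\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle\right]. \qquad (11.33)
\end{aligned}$$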


If you carefully compare the terms in the third and fourth lines of the expression 
above, you will notice a striking difference between them. In both terms in the 
third line (Eq. 11.32), the ket and bra vectors describing either of the two particles 
represent the same state (for instance, $\langle\psi_1^{(1)}|$ and $|\psi_1^{(1)}\rangle$, or $\langle\psi_2^{(2)}|$ and $|\psi_2^{(2)}\rangle$), while the ket and bra 
vectors of the same particle in the fourth line (Eq. 11.33) correspond to different 
states (for instance, $\langle\psi_1^{(1)}|$ and $|\psi_2^{(1)}\rangle$, or $\langle\psi_2^{(2)}|$ and $|\psi_1^{(2)}\rangle$). In other words, the terms in the line 
labeled as Eq. 11.32 look like regular single-particle expectation values, while the 
terms in the next line look like non-diagonal matrix elements computed between 
different states for each particle. You can also notice that the two terms in Eq. 11.32 
can be transformed into each other by the exchange operator $\hat{P}(1,2)$. Since the particles 
are identical, no matrix element can change as a result of the transposition, which 
means that these terms are equal to each other. If you, however, apply the exchange 
operator to the terms in Eq. 11.33, you will generate expressions where ket and 
bra vectors are reversed, meaning that these terms are complex conjugates of each 
other. Finally, you can easily see that the expression in Eq. 11.32 would have exactly 
the same form even if the particles in question were distinguishable, while Eq. 11.33 
results from the antisymmetrization requirements imposed on the two-electron state. 

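Taking all this into account, the interaction expectation value can be presented as 

$$\langle V(1,2)\rangle = V_C + V_{exc}, \tag{11.34}$$

where $V_C$ is defined as 

$$V_C = \langle\psi_2^{(2)}|\langle\psi_1^{(1)}|V(1,2)|\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle$$

and $V_{exc}$ as 

$$V_{exc} = -\mathrm{Re}\left[\langle\psi_2^{(2)}|\langle\psi_1^{(1)}|V(1,2)|\psi_2^{(1)}\rangle|\psi_1^{(2)}\rangle\right].$$

Using the position representation for the orbital states, the expression for $V_C$ can be 
written down in the explicit form 

$$V_C = \frac{e^2}{4\pi\varepsilon_0}\int d^3r_1\int d^3r_2\,\frac{|\psi_1(\mathbf{r}_1)|^2\,|\psi_2(\mathbf{r}_2)|^2}{|\mathbf{r}_1 - \mathbf{r}_2|}, \tag{11.35}$$

which makes all statements made about $V_C$ rather obvious. If you agree to identify 
$e|\psi(\mathbf{r})|^2$ with the charge density, you can interpret Eq. 11.35 as the classical energy 
of the Coulomb interaction between two continuously distributed charges with 
densities $e|\psi_1(\mathbf{r})|^2$ and $e|\psi_2(\mathbf{r})|^2$. 

The expression for $V_{exc}$ in the position representation takes the form 

$$V_{exc} = -\frac{e^2}{4\pi\varepsilon_0}\,\mathrm{Re}\left[\int d^3r_1\int d^3r_2\,\frac{\psi_1^*(\mathbf{r}_1)\,\psi_2^*(\mathbf{r}_2)\,\psi_2(\mathbf{r}_1)\,\psi_1(\mathbf{r}_2)}{|\mathbf{r}_1 - \mathbf{r}_2|}\right], \tag{11.36}$$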


which does not have any classical interpretation. This contribution to the energy is 
called exchange energy, and its origin can be directly traced to the antisymmetrization 
requirement. The expectation value computed with the symmetric orbital state 
would have the same form as in Eq. 11.34, with one important difference: a different 
sign in front of the exchange energy term. Thus, previously degenerate states 
are now split by the interaction by an amount equal to $2|V_{exc}|$ on the basis of their 
spin states. Just think about it: in the absence of any special spin-orbit interaction 
term in the Hamiltonian, the energies of the two-electron states composed of the 
same single-particle orbitals depend on their spin state! This is a purely quantum 
effect, one of the manifestations of the oddity of quantum mechanics, which has 
profound experimental and technological implications. However, first of all, I want 
you to get some feeling about the actual magnitude of this effect; for this reason, I 
am going to compute the Coulomb and exchange energies for a simple example of 
a two-electron state of the helium atom. 
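
To make the sign change explicit, it may help to see the two expectation values side by side. The short sketch below uses only the definitions of $V_C$ and $V_{exc}$ from Eq. 11.34; the labels $E_{S=1}$ and $E_{S=0}$ for the interaction-corrected energies of the level $E_1 + E_2$ are introduced here just for illustration, and the sketch assumes that the expectation value of the interaction is an adequate estimate of the correction.

```latex
\begin{aligned}
E_{S=1} &\approx E_1 + E_2 + V_C + V_{exc} && \text{(antisymmetric orbital part, triplet)}, \\
E_{S=0} &\approx E_1 + E_2 + V_C - V_{exc} && \text{(symmetric orbital part, singlet)}, \\
E_{S=0} - E_{S=1} &\approx -2V_{exc}.
\end{aligned}
```

With this sign convention, a negative $V_{exc}$ places the triplet below the singlet, while a positive $V_{exc}$ reverses the order.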

For concreteness (and to simplify calculations), I will presume that the orbitals 
participating in the construction of the two-electron state are |1,0, 0) and |2,0,0), 
where I used the notation for the states from Chap. 8. In the position representation, 
the corresponding wave functions are = R\o(r \)/and ^ 2 (^* 2 ) = 

^ 20 ( 72 )/ V / 47r, where R\o and R 20 are hydrogen radial wave functions, and the factor 
I/VAtv is what is left of the spherical harmonics with zero orbital momentum. 
When integrating Eq. 11.35 with respect to iq, I can choose the Z-axis of the 
spherical coordinate system in the direction of 7 * 2 , in which case the denominator 
in this equation can be written down as 

h -r 2 1 = y / rf + r|-2r 1 r 2 cos ~Q\. 

The integral over r\ now becomes 

OO 71 2 tz 

32 r r r e -^n/a B 

I(r 2 ) = —5" / dr\ / dOi / d(pi ffsinflt = 

Ana e { J 0 { yr2 + r 2 2 -2r 1 r 2 cos0 1 

OO 1 

dnr\e~ AnlaB f dx / ' 

-1 y r \ + r 2 -2nr 2 x 

where I substituted 



Rw = 2(2/a B ) 3/1 exp(-2r/a B ) 


(remember that Z = 2 for He). Integral over v yields 


1 

r\r 2 


1 

/ 


dx- 


2r\r 2 

= ~ f 

2 nr 2 J 


dz 


-1 1 /r 2 + r 2 -2r 1 r 2 x 2nr2 J nn ft 

y / r 2 + r 2 + 2r 1 r 2 - 


rj + z 


r\+r 2 - |n - r 2 | 


nr 2 


(11.37) 
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Evaluating this expression separately for r\ > r 2 and r\ < r 2 ,1 find for I(r 2 ) 


128 / 
Hri) = -3— / 

dr l r\e~ a,n,aB + 

[ d 

a B r 2 J 

a B 

J 

0 


n 



4 " 



r 2 


-4n/a B _ 


Now, using 


R 


20 




= 2 f “V' 2 f, — ex p ■ 

\a B J V a Bj V a Bj 


-4r 2 /a B 


I get for Vc 


V c = 


8 


OO 71 2 tz 


(11.38) 


4jt£o 4jta 3 B 


j dr 2 j dd 2 j dcp 2 sin 0 2 r^I{r 2 ) ^1 — — ^ exp ^ = 


0 0 


e 2 32 
4jt£q al 


00 ? 

[ dr 2 r 2 |"l - (1 + —) e- 4r2/a A (1 - — ) exp = 

J L V a B J JV a Bj V a B ) 


272 e 2 
81 471^0^5 


= 3.3 5Ry ^ 46.34 eV 


where I used Eq. 8.17 with Z set to unity and notation Ry = 13.8 eV for hydrogen’s 
ground state (in vacuum). 

Now, I will compute the exchange energy correction. Keeping the same notation 
7 ( 7 * 2 ) for the first integral with respect to r\ , I can present it, using expressions for 
the radial functions provided above, as 


o /o °p n r 2 r (l — — ) exp ) 

h(r 2 ) = J dr ' J J d(p\r\ sin 6\- 

000 


4jta 3 B 


^r\ + rj- 2r\r 2 cos 0\ 


4x/2 




1 yjr\ + 4 - 2 ri r 2 x 


Equation 11.37 for the angular integral and Mathematica © for the remaining radial 
integrals yield 
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Plugging it into Eq. 11.36 and dropping the real value sign because all the functions 
in the integral are real, I have 


Vexc — 


4tt£o 27 as 


8V2 1 2 \ 3/2 ( 1 \ 3/2 

4 uJ w : 


LXJ 

[dr 2 rjex p 6 - —) ex P (“— ) f 1 + ex P (“— ) = 

J V a B J V a Bj \ a B )\ a B J \ a B ) 

0 

if 4(' - S) (' + v B ) “ -'- 21eV - 


Thus, this calculation showed that a state with S = 1 has a smaller energy than 
a state with S = 0 by 2 x 1.21 = 2.42 eV. However, the sign of the integral 
in the exchange term depends on the fine details of wave functions representing 
single-electron orbitals and is not predestined. If single-particle orbitals of electrons 
were different, describing, for instance, electrons with non-zero orbital momentum 
in outer shells of heavier elements, or electrons in metals, the situation might have 
been reversed, and the singlet spin state could have a lower energy than a triplet. 
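
The dimensionless radial integrals behind the Coulomb and exchange corrections can be checked exactly with a few lines of Python. The sketch below is an illustration rather than part of the derivation: it assumes the substitution $x = r_2/a_B$ in the integrands quoted above and relies on $\int_0^\infty x^n e^{-kx}dx = n!/k^{n+1}$; the helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def exp_moment(n, k):
    """Exact integral of x**n * exp(-k*x) over [0, inf): n! / k**(n+1)."""
    return Fraction(factorial(n), k ** (n + 1))

def integrate_poly_exp(coeffs, k):
    """Integrate sum_n coeffs[n] * x**n * exp(-k*x) over [0, inf)."""
    return sum(Fraction(c) * exp_moment(n, k) for n, c in coeffs.items())

# Coulomb term: x[1 - (1 + 2x)e^{-4x}](1 - x)^2 e^{-2x}
#   = (x - 2x^2 + x^3) e^{-2x} - (x - 3x^3 + 2x^4) e^{-6x}
direct = (integrate_poly_exp({1: 1, 2: -2, 3: 1}, 2)
          - integrate_poly_exp({1: 1, 3: -3, 4: 2}, 6))

# Exchange term: x^2 (1 - x)(1 + 3x) e^{-6x} = (x^2 + 2x^3 - 3x^4) e^{-6x}
exchange = integrate_poly_exp({2: 1, 3: 2, 4: -3}, 6)

print("direct radial integral  :", direct)
print("exchange radial integral:", exchange)
```

Both numbers are pure fractions; multiplying each by the prefactor of its integral gives the corresponding energy in units of $e^2/(4\pi\varepsilon_0 a_B)$.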

This difference between energies of symmetric and antisymmetric spin states
gives rise to what is known as the exchange interaction between spins and plays
an extremely important role in the magnetic properties of materials. In particular,
this “interaction,” which is simply a result of the fermion nature of electrons,
underlies the formation of the ordered spin arrangements behind such
phenomena as ferromagnetism and antiferromagnetism.

Ferromagnets—materials with permanent magnetization—have been known 
since the earliest days of human civilization, but the origin of their magnetic 
properties remained a mystery for a very long time. André-Marie Ampère (a French
physicist who lived between 1775 and 1836 and made seminal contributions to 
electromagnetism) proposed that magnetization is the result of the alignment of 
all dipole magnetic moments formed by circular electron currents of each atom 
in the same direction. This alignment, he believed, was due to the magnetostatic 
interaction between the dipoles, which made the energy of the system lowest when 
all dipoles point in the same direction. Unfortunately, calculations showed that the 
magnetostatic interaction is so weak that thermal fluctuations would destroy the 
ferromagnetic order even at temperatures as low as a few kelvins. The energy of
the spin exchange interaction is much bigger (if you think that 2 eV is a small energy,
you will be delighted to know that it corresponds to a temperature of more than
20,000 K). The temperature at which iron loses its ferromagnetic properties due to
thermal agitation is about 1043 K, which corresponds to an exchange energy of only
about 90 meV, and it is this exchange energy that is the real culprit behind the ordering of spin magnetic moments. In
the classical picture, ordering would mean that all magnetic moments are aligned 
in the same direction, but describing this phenomenon quantum mechanically, we 
would say that N spins S are aligned if they are in the symmetric state with total 
spin equal to NS. For such a state to correspond to the ground state of the system 
of N spins, the exchange energy must favor (be lower for) symmetric states over the 
antisymmetric states. An antisymmetric state classically could be described as an 
array of magnetic moments, each pair of which is aligned in the opposite directions, 
while quantum mechanically we would say that each pair of spins in the array is in 
the spin zero state. Materials with such an arrangement of magnetic moments are 
known as antiferromagnets, and for the antiferromagnetic state to be a ground
state, the exchange energy must change its sign compared to the ferromagnetic 
case. The complete theory of magnetic order in solids is rather complicated, so 
you should not think that this brief glimpse into this area gives you any kind of 
even remotely complete picture, but beyond all these complexities, there is a main 
underlying physical mechanism—the exchange energy. 
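
The conversions between energy and temperature quoted above follow from $E = k_B T$. A minimal sketch of that arithmetic, assuming the CODATA value of the Boltzmann constant in eV/K (the function names are illustrative):

```python
K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def energy_to_temperature(energy_ev):
    """Temperature (K) whose thermal energy k_B*T equals the given energy (eV)."""
    return energy_ev / K_B_EV_PER_K

def temperature_to_energy(temperature_k):
    """Thermal energy k_B*T (eV) at the given temperature (K)."""
    return temperature_k * K_B_EV_PER_K

t_exchange = energy_to_temperature(2.0)               # a 2 eV exchange splitting
e_curie_mev = 1e3 * temperature_to_energy(1043.0)     # iron's Curie temperature

print(f"2 eV corresponds to about {t_exchange:.0f} K")
print(f"1043 K corresponds to about {e_curie_mev:.0f} meV")
```

The comparison is only an order-of-magnitude one: it equates the ordering energy with $k_B T$ at the transition and ignores all numerical factors of a real theory of magnetism.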


11.4.2 Exchange Correlations 

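The symmetry requirement on the many-particle states of indistinguishable particles
affects not only their interaction energy but also their spatial positions. To illustrate
this point, I will compute the expectation value of the squared distance between two electrons
defined as

$$\left\langle(\mathbf{r}_1 - \mathbf{r}_2)^2\right\rangle = \left\langle r_1^2\right\rangle + \left\langle r_2^2\right\rangle - 2\left\langle\mathbf{r}_1\cdot\mathbf{r}_2\right\rangle. \tag{11.39}$$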

This time around, I will assume that the two electrons belong to two different
hydrogen-like atoms separated by a distance R small enough for the wave functions
describing states of each electron to have significant spatial overlap. I will also
assume that the electrons are in the same atomic orbital $|n,l,m\rangle$, but since each
of these orbitals belongs to a different atom, they represent different states,
even if they are described by the same set of quantum numbers. To distinguish
between these orbitals, I will add to the set of quantum numbers another parameter
characterizing the position of the nucleus: $|n,l,m,\mathbf{R}\rangle$. If the atoms are separated
by a significant distance, you can quite clearly ascribe an electron to the atom it
belongs to. However, when the distance between the nuclei becomes comparable with
the characteristic size of the electron’s wave function, this identification is no longer
possible, and you have to introduce two orbitals for each electron, $|n,l,m,\mathbf{R}_1\rangle_i$
and $|n,l,m,\mathbf{R}_2\rangle_i$, where the lower index $i$ outside of the ket symbol takes values 1
or 2 signifying one or another electron. This two-electron system can again be in
a singlet or triplet spin state demanding a symmetric or antisymmetric two-electron
orbital state:


$$|\pm\rangle = \frac{1}{\sqrt{2}}\left[|n,l,m,\mathbf{R}_1\rangle_1\,|n,l,m,\mathbf{R}_2\rangle_2 \pm |n,l,m,\mathbf{R}_1\rangle_2\,|n,l,m,\mathbf{R}_2\rangle_1\right]. \tag{11.40}$$


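The first two terms in Eq. 11.39 are determined by single-particle orbitals:

$$\begin{aligned}\langle\pm|\,r_1^2\,|\pm\rangle = \frac{1}{2}\Big[&{}_1\langle n,l,m,\mathbf{R}_1|\,r_1^2\,|n,l,m,\mathbf{R}_1\rangle_1 + {}_1\langle n,l,m,\mathbf{R}_2|\,r_1^2\,|n,l,m,\mathbf{R}_2\rangle_1\\ &\pm {}_1\langle n,l,m,\mathbf{R}_1|\,r_1^2\,|n,l,m,\mathbf{R}_2\rangle_1 \times {}_2\langle n,l,m,\mathbf{R}_1|n,l,m,\mathbf{R}_2\rangle_2\\ &\pm {}_1\langle n,l,m,\mathbf{R}_2|\,r_1^2\,|n,l,m,\mathbf{R}_1\rangle_1 \times {}_2\langle n,l,m,\mathbf{R}_2|n,l,m,\mathbf{R}_1\rangle_2\Big].\end{aligned}$$

When writing this expression, I took into account that the orbitals belonging to the
same atom are normalized, ${}_2\langle n,l,m,\mathbf{R}_2|n,l,m,\mathbf{R}_2\rangle_2 = 1$, but orbitals belonging
to different atoms are not necessarily orthogonal: ${}_2\langle n,l,m,\mathbf{R}_1|n,l,m,\mathbf{R}_2\rangle_2 \neq 0$.
A similar expression for $\langle\pm|\,r_2^2\,|\pm\rangle$ is

$$\begin{aligned}\langle\pm|\,r_2^2\,|\pm\rangle = \frac{1}{2}\Big[&{}_2\langle n,l,m,\mathbf{R}_1|\,r_2^2\,|n,l,m,\mathbf{R}_1\rangle_2 + {}_2\langle n,l,m,\mathbf{R}_2|\,r_2^2\,|n,l,m,\mathbf{R}_2\rangle_2\\ &\pm {}_2\langle n,l,m,\mathbf{R}_1|\,r_2^2\,|n,l,m,\mathbf{R}_2\rangle_2 \times {}_1\langle n,l,m,\mathbf{R}_1|n,l,m,\mathbf{R}_2\rangle_1\\ &\pm {}_2\langle n,l,m,\mathbf{R}_2|\,r_2^2\,|n,l,m,\mathbf{R}_1\rangle_2 \times {}_1\langle n,l,m,\mathbf{R}_2|n,l,m,\mathbf{R}_1\rangle_1\Big].\end{aligned}$$

Since both atoms are assumed to be identical, the following must be true:

$$\begin{aligned}{}_1\langle n,l,m,\mathbf{R}_1|\,r_1^2\,|n,l,m,\mathbf{R}_1\rangle_1 &= {}_2\langle n,l,m,\mathbf{R}_2|\,r_2^2\,|n,l,m,\mathbf{R}_2\rangle_2 = a^2\\ {}_1\langle n,l,m,\mathbf{R}_2|\,r_1^2\,|n,l,m,\mathbf{R}_2\rangle_1 &= {}_2\langle n,l,m,\mathbf{R}_1|\,r_2^2\,|n,l,m,\mathbf{R}_1\rangle_2 = b^2\\ {}_1\langle n,l,m,\mathbf{R}_1|\,r_1^2\,|n,l,m,\mathbf{R}_2\rangle_1 &= {}_2\langle n,l,m,\mathbf{R}_2|\,r_2^2\,|n,l,m,\mathbf{R}_1\rangle_2 = u\\ {}_2\langle n,l,m,\mathbf{R}_1|n,l,m,\mathbf{R}_2\rangle_2 &= {}_1\langle n,l,m,\mathbf{R}_2|n,l,m,\mathbf{R}_1\rangle_1 = v.\end{aligned}$$

All these relations can be obtained by noticing that the system remains unchanged
if you replace $\mathbf{R}_1 \leftrightarrow \mathbf{R}_2$ and simultaneously exchange electron indexes 1 and 2. Taking
into account these relations and the corresponding simplified notations, I can write for
$\langle\pm|\,r_{1,2}^2\,|\pm\rangle$:

$$\langle\pm|\,r_1^2\,|\pm\rangle = \langle\pm|\,r_2^2\,|\pm\rangle = \frac{1}{2}\left[a^2 + b^2 \pm\left(uv + u^*v^*\right)\right].$$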
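The next step is to evaluate $\langle\mathbf{r}_1\cdot\mathbf{r}_2\rangle$:

\begin{align}\langle\pm|\,\mathbf{r}_1\cdot\mathbf{r}_2\,|\pm\rangle = \frac{1}{2}\Big[&{}_1\langle n,l,m,\mathbf{R}_1|\,\mathbf{r}_1\,|n,l,m,\mathbf{R}_1\rangle_1 \cdot {}_2\langle n,l,m,\mathbf{R}_2|\,\mathbf{r}_2\,|n,l,m,\mathbf{R}_2\rangle_2 \tag{11.41}\\ &+ {}_1\langle n,l,m,\mathbf{R}_2|\,\mathbf{r}_1\,|n,l,m,\mathbf{R}_2\rangle_1 \cdot {}_2\langle n,l,m,\mathbf{R}_1|\,\mathbf{r}_2\,|n,l,m,\mathbf{R}_1\rangle_2 \notag\\ &\pm {}_1\langle n,l,m,\mathbf{R}_1|\,\mathbf{r}_1\,|n,l,m,\mathbf{R}_2\rangle_1 \cdot {}_2\langle n,l,m,\mathbf{R}_1|\,\mathbf{r}_2\,|n,l,m,\mathbf{R}_2\rangle_2 \notag\\ &\pm {}_1\langle n,l,m,\mathbf{R}_2|\,\mathbf{r}_1\,|n,l,m,\mathbf{R}_1\rangle_1 \cdot {}_2\langle n,l,m,\mathbf{R}_2|\,\mathbf{r}_2\,|n,l,m,\mathbf{R}_1\rangle_2\Big]. \tag{11.42}\end{align}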


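The evaluation of these expressions requires a more explicit determination of the
point with respect to which electron position vectors are defined. Assuming for
concreteness that the origin of the coordinate system is at the nucleus of atom 1, I can
immediately note that the symmetry with respect to inversion kills the first two terms
in Eq. 11.42 since ${}_1\langle n,l,m,\mathbf{R}_1|\,\mathbf{r}_1\,|n,l,m,\mathbf{R}_1\rangle_1 = {}_2\langle n,l,m,\mathbf{R}_1|\,\mathbf{r}_2\,|n,l,m,\mathbf{R}_1\rangle_2 = 0$.
The remaining two terms survive and can be written as

$$\langle\pm|\,\mathbf{r}_1\cdot\mathbf{r}_2\,|\pm\rangle = \pm|\mathbf{d}|^2,$$

where I introduced the vector $\mathbf{d}$ defined as follows:

$${}_1\langle n,l,m,\mathbf{R}_1|\,\mathbf{r}_1\,|n,l,m,\mathbf{R}_2\rangle_1 = {}_2\langle n,l,m,\mathbf{R}_2|\,\mathbf{r}_2\,|n,l,m,\mathbf{R}_1\rangle_2 = \mathbf{d}.$$

Finally, combining all the obtained results together, I can write

$$\left\langle(\mathbf{r}_1 - \mathbf{r}_2)^2\right\rangle = a^2 + b^2 \pm\left(uv + u^*v^* - 2|\mathbf{d}|^2\right). \tag{11.43}$$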


While the actual computation of matrix elements appearing in Eq. 11.43 is rather 
difficult and will not be attempted here, you can still learn something from this 
exercise. Its main lesson is that the spin state of the electrons affects how close the 
electrons of the two atoms can be. Assuming for concreteness that the expression
in the parentheses in Eq. 11.43 is negative, which is favored by the term $|\mathbf{d}|^2$ (the
actual sign depends on the single-electron orbitals), one can conclude that the
antisymmetric spin state promoting the symmetric orbital state (+ sign in ±) results
in electrons being closer together than in the case of the symmetric spin state. This
is an interesting quantum mechanical effect: electrons appear to be “pushed” closer 
toward each other or further away from each other depending on their spin state even 
though there is no actual physical force doing the “pushing.” This phenomenon plays 
an important role in chemical bonding between atoms, because electrons, when 
“pushed” toward each other, pull their nuclei along making the formation of a stable 
bi-atomic molecule more likely. 
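
The effect can be made concrete in a toy model that is not taken from the text: two one-dimensional Gaussian orbitals of unit width, one centered at the origin and one at a distance R (both choices are assumptions made purely for illustration). Unlike Eq. 11.40, the sketch normalizes the two-particle state exactly, which adds the factor $1/(1 \pm S^2)$ with $S$ the overlap of the two orbitals.

```python
import math

def gaussian(x, center):
    """Normalized 1D Gaussian orbital of unit width centered at `center`."""
    return math.pi ** -0.25 * math.exp(-0.5 * (x - center) ** 2)

def element(c_bra, c_ket, power, lo=-12.0, hi=16.0, steps=4000):
    """Trapezoid estimate of <bra| x**power |ket> for two real Gaussian orbitals."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * gaussian(x, c_bra) * x ** power * gaussian(x, c_ket)
    return total * h

def mean_square_separation(R, sign):
    """<(x1 - x2)^2> in the orbital state A(1)B(2) + sign*B(1)A(2), sign = +1 or -1."""
    overlap = element(0.0, R, 0)
    # Terms that survive without any exchange: <x^2>_A + <x^2>_B - 2<x>_A<x>_B
    direct = (element(0.0, 0.0, 2) + element(R, R, 2)
              - 2.0 * element(0.0, 0.0, 1) * element(R, R, 1))
    # Exchange terms: 2<A|x^2|B><B|A> - 2<A|x|B>^2
    cross = 2.0 * element(0.0, R, 2) * overlap - 2.0 * element(0.0, R, 1) ** 2
    return (direct + sign * cross) / (1.0 + sign * overlap ** 2)

R = 1.5  # separation of the two centers in units of the orbital width
symmetric = mean_square_separation(R, +1)      # symmetric orbital state (singlet spin)
antisymmetric = mean_square_separation(R, -1)  # antisymmetric orbital state (triplet spin)
print(symmetric, antisymmetric)
```

For these particular orbitals the symmetric orbital state gives the smaller mean square separation; other orbitals could reverse the ordering, in line with the remark above about the sign.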
11.5 Fermi Energy

The behavior of systems consisting of many identical particles (and by many here I 
mean really huge, something like Avogadro’s number) is studied by a special field of 
physics called quantum statistics. Even a sketchy review of this field would take us 
well outside the scope of this book, but there is one problem involving a very large 
number of fermions, which we can handle. The issue in question is the structure
of the ground state and its energy for a system of $N \gg 1$ non-interacting free
electrons (an ideal electron gas) confined within a box of volume V. Each electron is
a free particle characterized by a momentum $\mathbf{p}$, the corresponding single-particle energy
$E_p = p^2/2m_e$, and a single-particle wave function (in the position representation)
$\psi_{\mathbf{p}}(\mathbf{r}) = A_p\exp(i\mathbf{p}\cdot\mathbf{r}/\hbar)$, where $A_p$ is a normalization parameter, which was chosen
in Sect. 5.1.1 to be $1/\sqrt{2\pi\hbar}$ to generate a delta-function normalized wave function.
Here it is more convenient to choose an alternative normalization, which would
explicitly include the volume V occupied by the electrons. To achieve this, I will impose
so-called periodic boundary conditions:


$$\psi_{\mathbf{p}}(\mathbf{r} + \mathbf{L}) = \psi_{\mathbf{p}}(\mathbf{r}), \tag{11.44}$$


where $\mathbf{L}$ is a vector with components $L_x$, $L_y$, $L_z$ such that $L_xL_yL_z = V$. This
boundary condition is the most popular choice in solid-state physics, and if you
are wondering about its physical meaning and any kind of relation to reality, it does
not really have any. The logic of using it is based upon two ideas. First, it is more
convenient than, say, particle-in-the-box boundary conditions $\psi_{\mathbf{p}}(\mathbf{L}) = 0$, implying
that the electrons are confined in an infinite potential well, because it keeps the wave
functions in the complex exponential form rather than forcing them to become much
less convenient real-valued sine functions. Second, it is believed that as long as we
are not interested in specific surface-related phenomena, the behavior of the wave
functions at the boundary of a solid should not have any impact on its bulk properties.
I have used a similar idea when computing the total energy of electromagnetic field 
in Sect. 7.3.1. 

This boundary condition imposes restrictions on the allowed values of the
electron’s momentum:

$$\exp\left(i\mathbf{p}\cdot(\mathbf{r} + \mathbf{L})/\hbar\right) = \exp\left(i\mathbf{p}\cdot\mathbf{r}/\hbar\right) \;\Rightarrow\; p_i = \frac{2\pi\hbar}{L_i}n_i,$$

where $i = x, y, z$ and $n_i = \pm1, \pm2, \pm3, \ldots$. In addition to making the spectrum
of the momentum operator discrete, the periodic boundary condition also allows an
alternative normalization of the wave function:
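$$|A_p|^2\int_{-L_x/2}^{L_x/2}\int_{-L_y/2}^{L_y/2}\int_{-L_z/2}^{L_z/2} e^{-i\mathbf{p}\cdot\mathbf{r}/\hbar}\,e^{i\mathbf{p}\cdot\mathbf{r}/\hbar}\,dx\,dy\,dz = 1,$$

which yields $A_p = 1/\sqrt{V}$. The system of normalized and orthogonal single-electron
wave functions takes the form

$$\psi_{n_1,n_2,n_3}(\mathbf{r}) = \frac{1}{\sqrt{V}}\exp\left[2\pi i\left(\frac{n_1x}{L_x} + \frac{n_2y}{L_y} + \frac{n_3z}{L_z}\right)\right],$$

while the single-electron energies form a discrete spectrum defined by

$$\epsilon_{n_1,n_2,n_3} = \frac{2\pi^2\hbar^2}{m_e}\left(\frac{n_1^2}{L_x^2} + \frac{n_2^2}{L_y^2} + \frac{n_3^2}{L_z^2}\right). \tag{11.46}$$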


This wave function generates two single-electron orbitals characterized by two
different values of the spin magnetic number $m_s$, which is perfectly suitable for
generating many-particle states of the N-electron system. The ground state of this
system is given by the Slater determinant formed by N single-electron orbitals with
the lowest possible single-particle energies ranging from $\epsilon_{1,1,1}$ to some maximum
value $\epsilon_F$ corresponding to the last orbital making it into the determinant. Thus,
all single-particle orbitals of electrons are divided into two groups: those that are
included in the Slater determinant for the ground state (occupied) and those that
are not (empty or vacant). The occupied and empty orbitals are separated by the energy
$\epsilon_F$ known as the Fermi energy. The Fermi energy is an important characteristic of an
electron gas, which obviously depends on the number of electrons N and determines
many of its ground state properties. Thus, let’s spend some time trying to figure it
out.

In principle, finding $\epsilon_F$ is quite straightforward: one needs to find the total
number of orbitals $M(\epsilon)$ with energies less than $\epsilon$. Then the Fermi energy is found
from the equation

$$M(\epsilon_F) = N. \tag{11.47}$$

However, counting the orbitals and finding $M(\epsilon)$ is not quite trivial because the energy
values defined by Eq. 11.46 are highly degenerate, and what is even worse is that
there is no known analytical formula for the degree of the degeneracy as a function
of energy. The problem, however, can be solved in the limit when $N \to \infty$ and
$V \to \infty$ so that the concentration of electrons N/V remains constant. In this limit the
discrete granular structure of the energy spectrum becomes negligible (the spectrum
in this case is called quasi-continuous), and the function $M(\epsilon)$ can be determined.

You might think that I am nuts because I first introduce finite V to make the
spectrum discrete and then go to the limit $V \to \infty$ to make it continuous again. The
thing is that if I had begun with the infinite volume and continuous spectrum, the
only information I would have had about the number of states is that it is infinite
(the number of states of a continuous spectrum is infinite for any finite interval of
energies), which does not help me at all. What I am getting out of this roundabout
approach is the knowledge about how the number of states turns infinite when the
volume goes to infinity, and as you will see, this is exactly what we need to find the
Fermi energy.

In order to find $M(\epsilon)$, it is convenient to visualize the states that need to be
counted. This can be done by presenting each single-electron orbital graphically as
a point with coordinates $n_1, n_2, n_3$ in a three-dimensional space defined by a regular
Cartesian coordinate system with axes X, Y, and Z. Each point here represents two
orbitals with different values of the spin magnetic number. Surrounding each point
by a little cube with sides equal to unity, I can cover the entire three-dimensional
space containing the electron’s orbitals. Since each cube has a unit volume, the total
volume covered by the cubes is equal to the number of points within the covered
region. Since each point represents two orbitals with opposite spins, the number of
all orbitals in this region is twice the number of points.

For simplicity let me assume that $L_x = L_y = L_z = L$, which allows me to
rewrite Eq. 11.46 in the form

$$n^2(\epsilon) = n_1^2 + n_2^2 + n_3^2, \tag{11.48}$$

where I introduced

$$n^2(\epsilon) = \frac{2m_e\epsilon L^2}{(2\pi\hbar)^2}. \tag{11.49}$$

Equation 11.48 defines a sphere in the space of electron orbitals with radius
$n \propto L\sqrt{\epsilon}$. If you allow non-integer values for the numbers $n_{1,2,3}$, you could say that each
point on the surface of the sphere corresponds to states with the same energy $\epsilon$ (such
a surface is called isoenergetic). All points in the interior of the surface correspond to
states with energies less than $\epsilon$, while all points in the exterior represent states with
energies larger than $\epsilon$. Now, the number of orbitals encompassed by the surface is, as
I just explained, simply equal to the volume of the corresponding region multiplied
by two to account for two values of spin. Thus, I can write for the number of states
with energies less than $\epsilon$:⁴


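$$M(\epsilon) = 2\cdot\frac{4\pi}{3}n^3 = 2\cdot\frac{4\pi}{3}\frac{L^3(2m_e\epsilon)^{3/2}}{(2\pi\hbar)^3} = \frac{V(2m_e\epsilon)^{3/2}}{3\pi^2\hbar^3}. \tag{11.50}$$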


⁴ If instead of periodic boundary conditions you used the particle-in-the-box boundary
conditions requiring that the wave function vanishes at the boundary of the region $L_x \times L_y \times L_z$,
you would have ended up with $p_i = \pi\hbar n_i/L_i$, where $n_i$ now can only take positive values because
wave functions $\sin(\pi n_1x/L_x)\sin(\pi n_2y/L_y)\sin(\pi n_3z/L_z)$ with positive and negative values of $n_i$
represent the same function, while the function $\exp\left[i\left(2\pi n_1x/L_x + 2\pi n_2y/L_y + 2\pi n_3z/L_z\right)\right]$ with positive and
negative indexes represents two linearly independent states. As a result, the count of Eq. 11.50 in
this case would have an extra factor 1/8 reflecting the fact that only 1/8 of a sphere corresponds to
points with all positive coordinates. The radius of that sphere is, however, twice as large because the
step in momentum is halved, so the resulting $M(\epsilon)$ is the same.
Fig. 11.3 Two-dimensional version of the state counting procedure described in the text: squares replace cubes, a circle represents a sphere, but the points are still the states, now specified by two instead of three integer numbers. The 2-D version is easier to process visually but illustrates all the important points



The problem with this calculation is, of course, that the points on the surface do
not necessarily correspond to integer values of $n_1$, $n_2$, and $n_3$ so that this surface
cuts through the little cubes in its immediate vicinity (see Fig. 11.3 representing
a two-dimensional version of the described construction). As a result, some states
with energies in the thin layer surrounding the spherical surface cannot be counted
accurately. The number of such states is obviously proportional to the area of the
enclosing sphere, which is $\propto L^2$, while the number of states counted correctly is
$\propto L^3$, so that the relative error of the outlined procedure behaves as $1/L$ and approaches
zero as L goes to infinity. Thus, Eq. 11.50 can be considered to be asymptotically
correct in the limit $L \to \infty$. Now you can see the value of this procedure with the
initial discretization followed by passing to the quasi-continuous limit. It allowed
me to establish the exact dependence of M on the volume V expressed
by Eq. 11.50. Now I can easily find the Fermi energy by substituting Eq. 11.50 into
Eq. 11.47:

$$\frac{V(2m_e\epsilon_F)^{3/2}}{3\pi^2\hbar^3} = N \;\Rightarrow\; \epsilon_F = \frac{\hbar^2}{2m_e}\left(3\pi^2\frac{N}{V}\right)^{2/3}. \tag{11.51}$$


The most important feature of Eq. 11.51 is that the number of electrons N and the
volume they occupy V, the two quantities which are supposed to go to infinity,
appear in this equation only in the form of the ratio N/V, which we shall obviously
keep constant when passing to the limit $V \to \infty$, $N \to \infty$. The ratio N/V
specifies the number of electrons per unit volume and is also known as the electron
concentration. This is one of the most important characteristics of the electron gas.
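
The state counting behind Eq. 11.50 is easy to test by brute force. The sketch below counts integer points inside a sphere of radius n and compares twice that number with $2\cdot\frac{4\pi}{3}n^3$. It is an illustration under one stated assumption: points with some $n_i = 0$ are included, which changes the count only by a surface-sized correction; the function names are hypothetical.

```python
import math

def count_orbitals(radius):
    """Spin orbitals (two per lattice point) with n1^2 + n2^2 + n3^2 <= radius^2."""
    r2 = radius * radius
    points = 0
    for n1 in range(-radius, radius + 1):
        for n2 in range(-radius, radius + 1):
            rest = r2 - n1 * n1 - n2 * n2
            if rest < 0:
                continue
            # n3 runs over -isqrt(rest) .. +isqrt(rest)
            points += 2 * math.isqrt(rest) + 1
    return 2 * points

def sphere_estimate(radius):
    """Continuum estimate: twice the volume of a sphere of the given radius."""
    return 2.0 * (4.0 / 3.0) * math.pi * radius ** 3

def relative_error(radius):
    estimate = sphere_estimate(radius)
    return abs(count_orbitals(radius) - estimate) / estimate

for radius in (5, 10, 20, 40, 80):
    print(radius, count_orbitals(radius), round(sphere_estimate(radius)),
          f"{relative_error(radius):.2e}")
```

The printed relative error shrinks as the radius grows, which is the numerical counterpart of the statement that the sphere-volume formula becomes asymptotically exact for large L.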

It is important to understand that the Fermi energy is the single-electron energy of 
the last “occupied” single-particle orbital and is not the energy of the many-electron 
ground state. To find the latter I need to add energies of all occupied single-electron 
orbitals: 
$$E = 2\sum_{n_1,n_2,n_3}^{N_{max}}\epsilon_{n_1,n_2,n_3}, \tag{11.52}$$

where $N_{max}$ is the collection of indexes $n_1, n_2, n_3$ corresponding to the last occupied
state and the factor of 2 accounts for the spin variable. I will compute this sum again
in the limit $V \to \infty$, $N \to \infty$ (which, by the way, is called the thermodynamic limit),
and while doing so I will show you a trick of converting discrete sums into integrals.
This operation is, again, possible because in the thermodynamic limit the discrete
spectrum becomes quasi-continuous, and the arguments I am going to employ here
are essentially the same as the ones used to compute the Fermi energy but with a
slightly different flavor.

So I begin. When an orbital index $n_i$ changes by one, the change of the respective
component of the momentum $p_i$ can be presented as

$$\Delta p_i = \frac{2\pi\hbar}{L_i}\Delta n_i,$$

where $\Delta n_i = 1$. With this relation in mind, I can rewrite Eq. 11.52 as


$$E = 2\sum_{n_1,n_2,n_3}^{N_{max}}\epsilon_{n_1,n_2,n_3}\,\Delta n_1\Delta n_2\Delta n_3 = 2\frac{L^3}{(2\pi\hbar)^3}\sum_{p_x,p_y,p_z}^{N_{max}}\epsilon_{p_x,p_y,p_z}\,\Delta p_x\Delta p_y\Delta p_z,$$

where I again set $L_x = L_y = L_z = L$ (remember that $\Delta n_i = 1$, so by including
the factors $\Delta n_1\Delta n_2\Delta n_3$ into the original sum, I did not really change anything). Now,
when $L \to \infty$, $\Delta n_i$ remains equal to unity, but $\Delta p_i \to 0$, so that the corresponding
sum in the preceding expression turns into an integral:


$$E = 2\frac{V}{(2\pi\hbar)^3}\iiint dp_x\,dp_y\,dp_z\,\epsilon(p_x, p_y, p_z), \tag{11.53}$$


where I changed the notation for the energy to emphasize that momentum is now
not a discrete index, but a continuous variable. Since the single-particle energy
$\epsilon(p_x, p_y, p_z)$ depends only upon $p^2$, it makes sense to compute the integral in
Eq. 11.53 using the representation of the momentum vector in spherical coordinates.
Replacing the Cartesian volume element $dp_x\,dp_y\,dp_z$ with its spherical counterpart
$p^2\sin\theta\,d\theta\,d\varphi\,dp$, where $\theta$ and $\varphi$ are polar and azimuthal angles characterizing the
direction of vector $\mathbf{p}$, I can rewrite Eq. 11.53 as


$$E = 2\frac{V}{(2\pi\hbar)^3}\int_0^{p_F}\int_0^{\pi}\int_0^{2\pi}\epsilon(p)\,p^2\sin\theta\,d\varphi\,d\theta\,dp,$$







where $p_F$ is the magnitude of the momentum corresponding to the Fermi energy $\epsilon_F$.
I proceed by replacing the integration variable $p$ with another variable $\epsilon$ according to
the relation $p = \sqrt{2m_e\epsilon}$:

$$E = 2\frac{V(2m_e)^{3/2}}{2(2\pi\hbar)^3}\,4\pi\int_0^{\epsilon_F}\epsilon\sqrt{\epsilon}\,d\epsilon. \tag{11.54}$$

Before computing this integral, let me point out that it can be rewritten in the
following form:

$$E = \int_0^{\epsilon_F}\epsilon\,g(\epsilon)\,d\epsilon, \tag{11.55}$$

where I introduced the quantity

$$g(\epsilon) = \frac{V(2m_e)^{3/2}\sqrt{\epsilon}}{2\pi^2\hbar^3} \tag{11.56}$$

called the density of states. This quantity, which gives the number of states per unit
energy interval, is an important characteristic of any many-particle system. Actually
it is important enough to give me an incentive to deviate from the original goal of
computing the integral in Eq. 11.54 and spend more time talking about it.

To convince you that $g(\epsilon)\,d\epsilon$ can, indeed, be interpreted as the number of states
with energies within the interval $[\epsilon, \epsilon + d\epsilon]$, I will simply compute this quantity
directly using the same state counting technique that I used to find the Fermi
energy. However, this time around I am interested in the number of states within a
spherical shell with inner radius $n(\epsilon)$ and outer radius $n(\epsilon + d\epsilon)$:

$$n(\epsilon + d\epsilon) = n(\epsilon) + d\epsilon\,\frac{dn}{d\epsilon} = \frac{L}{2\pi\hbar}\sqrt{2m_e\epsilon} + \frac{L}{4\pi\hbar}\sqrt{\frac{2m_e}{\epsilon}}\,d\epsilon,$$

where I used Eq. 11.49 for $n(\epsilon)$. The volume occupied by this shell is

$$\Delta V = 4\pi n^2\,dn = 4\pi n^2\frac{dn}{d\epsilon}\,d\epsilon = 4\pi\frac{L^2\,2m_e\epsilon}{4\pi^2\hbar^2}\,\frac{L}{4\pi\hbar}\sqrt{\frac{2m_e}{\epsilon}}\,d\epsilon = \frac{V(2m_e)^{3/2}\sqrt{\epsilon}}{4\pi^2\hbar^3}\,d\epsilon.$$

Using again the fact that the volume allocated to a single point in Fig. 11.3 is equal
to one and that there are two single-electron orbitals per point, the total number of
states within this spherical layer is

$$\frac{V(2m_e)^{3/2}\sqrt{\epsilon}}{2\pi^2\hbar^3}\,d\epsilon,$$

which according to Eq. 11.56 is exactly $g(\epsilon)\,d\epsilon$.
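
Since $g(\epsilon)\,d\epsilon$ counts the states added when the energy grows by $d\epsilon$, the density of states must equal the derivative $dM/d\epsilon$ of the counting function of Eq. 11.50. A short numerical check of that statement, with an arbitrarily chosen volume and energy (both are illustrative assumptions, as are the function names):

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # joules per electron-volt

def states_below(eps, volume):
    """M(eps): number of orbitals with energy below eps (eps in joules)."""
    return volume * (2.0 * M_E * eps) ** 1.5 / (3.0 * math.pi ** 2 * HBAR ** 3)

def density_of_states(eps, volume):
    """g(eps): states per joule at energy eps."""
    return volume * (2.0 * M_E) ** 1.5 * math.sqrt(eps) / (2.0 * math.pi ** 2 * HBAR ** 3)

volume = 1e-6        # 1 cm^3 expressed in m^3
eps = 5.0 * EV       # 5 eV
step = 1e-4 * eps    # small energy step for the central difference

numeric = (states_below(eps + step, volume) - states_below(eps - step, volume)) / (2.0 * step)
exact = density_of_states(eps, volume)
relative_difference = abs(numeric - exact) / exact
print(relative_difference)
```

The two agree up to the discretization error of the finite difference.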

Now I can go back to Eq. 11.55 and complete the computation of the integral,
which is quite straightforward and yields

$$E = \frac{V(2m_e)^{3/2}}{2\pi^2\hbar^3}\int_0^{\epsilon_F}\epsilon^{3/2}\,d\epsilon = \frac{V(2m_e)^{3/2}}{5\pi^2\hbar^3}\,\epsilon_F^{5/2} = \frac{3}{5}N\epsilon_F^{5/2}\left(\frac{V(2m_e)^{3/2}}{3\pi^2N(\hbar^2)^{3/2}}\right).$$

In the last line, I rearranged the expression for the energy to make it clear (with the
help of Eq. 11.51) that the expression in the parentheses is $\epsilon_F^{-3/2}$ and that the ground
state energy of the non-interacting free electron gas can be written down as

$$E = \frac{3}{5}N\epsilon_F. \tag{11.57}$$
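
The factor 3/5 can be seen emerging from the discrete sum itself. In the sketch below every single-particle energy is measured in units of its common prefactor, so that it is simply $n_1^2 + n_2^2 + n_3^2$, and all lattice points inside a sphere of radius `n_f` are filled; including points with $n_i = 0$ is an assumption that does not matter in the large-volume limit. The spin factor of 2 cancels in the ratio.

```python
def ground_state_ratio(n_f):
    """E / (N * eps_F) when all lattice points with |n| <= n_f are occupied."""
    n_f2 = n_f * n_f
    count = 0
    energy = 0
    for n1 in range(-n_f, n_f + 1):
        for n2 in range(-n_f, n_f + 1):
            for n3 in range(-n_f, n_f + 1):
                s = n1 * n1 + n2 * n2 + n3 * n3
                if s <= n_f2:
                    count += 1      # one more occupied point
                    energy += s     # its energy in units of the common prefactor
    return energy / (count * n_f2)

for n_f in (5, 10, 20):
    print(n_f, round(ground_state_ratio(n_f), 4))
```

As `n_f` grows, the ratio settles near 0.6, in agreement with Eq. 11.57.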

This expression can also be rewritten in another no less illuminating form. Substituting
Eq. 11.51 into Eq. 11.57, I can present the energy E of the gas as a function of
volume:

$$E = \frac{3\hbar^2(3\pi^2)^{2/3}}{10m_e}\frac{N^{5/3}}{V^{2/3}}.$$

The fact that this energy depends on the volume draws out an important point: if
you try to expand (or contract) the volume occupied by the gas, its energy changes,
which means that someone has to do some work to effect this change. Recalling a
simple formula from an introductory thermodynamics class, $dW = P\,dV$, where W is
work and P is the pressure exerted by the gas on the walls of the containing vessel, and
taking into account that for a fixed number of particles, the energy E depends only on
the volume V (no temperature), you can relate $dW$ to $-dE$ and determine the pressure
exerted by the non-interacting electrons on the walls of the container as

$$P = -\frac{dE}{dV} = \frac{\hbar^2(3\pi^2)^{2/3}}{5m_e}\frac{N^{5/3}}{V^{5/3}}.$$

Thus, even in the ground state (which, by the way, from the thermodynamic point
of view corresponds to zero temperature), an electron gas exerts a pressure on the
surrounding medium which depends on the concentration of the electrons. The
coolest thing about this result is that unlike the case of a classical ideal gas, this
pressure has nothing to do with the thermal motion of the electrons because they
are in the ground state, which is equivalent to their temperature being equal to zero.
This pressure is a purely quantum effect solely due to the indistinguishability of the
electrons and their fermion nature.
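
To get a feeling for the magnitudes, the formulas of this section can be evaluated for an electron concentration typical of a metal. The value $8.5\times10^{28}\ \mathrm{m^{-3}}$ used below, close to the figure commonly quoted for copper, is an assumption chosen for illustration, as are the function names.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # joules per electron-volt

def fermi_energy(n):
    """Fermi energy (J) of an ideal electron gas with concentration n (m^-3)."""
    return HBAR ** 2 / (2.0 * M_E) * (3.0 * math.pi ** 2 * n) ** (2.0 / 3.0)

def energy_per_volume(n):
    """Ground state energy per unit volume, (3/5) * n * eps_F, in J/m^3."""
    return 0.6 * n * fermi_energy(n)

def degeneracy_pressure(n):
    """Zero-temperature pressure -dE/dV in Pa."""
    return HBAR ** 2 * (3.0 * math.pi ** 2) ** (2.0 / 3.0) / (5.0 * M_E) * n ** (5.0 / 3.0)

n_metal = 8.5e28  # assumed electron concentration, m^-3
eps_f_ev = fermi_energy(n_metal) / EV
pressure = degeneracy_pressure(n_metal)

print(f"Fermi energy: {eps_f_ev:.2f} eV")
print(f"Pressure    : {pressure:.3e} Pa")
```

The pressure comes out as two thirds of the energy density, $P = \frac{2}{3}E/V$, which follows directly from $E \propto V^{-2/3}$.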


11.6 Problems 
Problems for Sect. 11.1 


Problem 140 Consider the following configuration of single-particle orbitals for a 
system of four identical fermions: 



Applying exchange operator V ( ij ) to all pairs of particles in this configuration, 
generate all possible transpositions of the particles and determine the correct signs 
in front of them. Write down the correct antisymmetric four-fermion state involving 
these single-particle orbitals. 

Problem 141 Consider the system of two bosons that can be in one of four single-particle
orbitals. List all possible two-boson states adhering to the symmetrization
requirements.

Problem 142 Consider two non-interacting electrons in a one-dimensional harmonic
oscillator potential characterized by classical frequency $\omega$.

1. Consider single-electron orbitals $|\alpha_n, m_s\rangle = |n\rangle|m_s\rangle$ where $|n\rangle$ is an eigenvector
of the harmonic oscillator and $|m_s\rangle$ is a spinor describing one of two possible
eigenvectors of operator $\hat{S}_z$. Using orbitals $|\alpha_n, m_s\rangle$, write down the Slater determinant
for the two-electron ground state, and find the corresponding ground state
energy.

2. Do the same for the first excited state(s) of this system.

3. Write the two-electron states found in Parts 1 and 2 in the position-spinor
representation.

4. Now use the eigenvectors of the total spin of the two particles to construct the
two-particle ground and first excited states. Find the relations between the two-particle
states found here and those found in Parts 1 and 2.

5. Compute the expectation value $\left\langle(z_1 - z_2)^2\right\rangle$ where $z_{1,2}$ are coordinates of the two
electrons in the states determined above.

Problem 143 Repeat Problem 142 for two non-interacting bosons. 
Problems for Sect. 11.3

Problem 144 Consider an atom of nitrogen, which has three electrons in $l = 1$
states.

1. Using single-particle orbitals with $l = 1$ and different values of orbital and
spin magnetic numbers, construct all possible Slater determinants representing 
possible three-electron states. 

2. Applying operators + sP + sP to all found three-particle states, figure out 
the possible values of the total spin in these states. 


Problems for Sect. 11.4 

Problem 145 Consider two identical non-interacting particles both in the ground
states of their respective harmonic oscillator potentials. Particle 1 is in the potential
$V_1 = \frac{1}{2}m\omega^2x_1^2$, while particle 2 is in the potential $V_2 = \frac{1}{2}m\omega^2(x_2 - d)^2$.

1. Assuming that particles are spin-1/2 fermions in a singlet spin state, write down
the orbital portion of the two-particle state and compute the expectation value of
the two-particle Hamiltonian

$$\hat{H} = \frac{\hat{p}_1^2}{2m_e} + \frac{\hat{p}_2^2}{2m_e} + \frac{1}{2}m\omega^2x_1^2 + \frac{1}{2}m\omega^2(x_2 - d)^2$$

in this state.

2. Repeat the calculations assuming that the particles are in the state with total spin
S = 1.

3. The energy you found in Parts I and II depends upon the distance d between 
the equilibrium points of the potential. Classically such a dependence would 
mean that there is a force associated with this energy and describing repulsive or 
attractive interaction between the two particles. In the case under consideration, 
there is no real interaction, and what you have found is a purely quantum effect 
due to symmetry requirements on the two-particle states. Still, you can describe 
the result in terms of the effective “force” of interaction between the particles. 
Find this force for both singlet and triplet spin states, and specify its character 
(attractive or repulsive). 

Problem 146 Consider two electrons confined in a one-dimensional infinite
potential well of width d and interacting with each other via the potential
$V_{int} = -E_0(z_1 - z_2)^2$, where $E_0$ is a real positive constant and $z_{1,2}$ are coordinates of the
electrons.

1. Construct the ground state two-electron wave function assuming that electrons
are (a) in a singlet spin state and (b) in a triplet spin state.

2. Compute the expectation value of the interaction potential in each of these states.

3. With the interaction term included, which spin configuration would have smaller
energy?


Problem for Sect. 11.5 

Problem 147 Consider an ideal gas of N electrons confined in a three-dimensional 
harmonic oscillator potential: 

$$V = \frac{1}{2}m_e\omega^2\left(x^2 + y^2 + z^2\right) = \frac{1}{2}m_e\omega^2r^2.$$

Find the Fermi energy of this system and the total energy of the many-electron 
ground state. Hint: The degeneracy degrees of the single-particle energy levels 
in this case can be easily found analytically, so no transition to quasi-continuous 
spectrum and from summation to integration is necessary. 


Part III 

Quantum Phenomena and Methods 


In this part of the book, I will introduce you to the wonderful world of actual
experimentally observable quantum mechanical phenomena. The theoretical description
of each of these phenomena will require developing special technical methods,
which I will present as we go along. So, let the journey begin.



Chapter 12 

Resonant Tunneling




12.1 Transfer-Matrix Approach in One-Dimensional 
Quantum Mechanics 

12.1.1 Transfer Matrix: General Formulation 

In Sects. 6.2 and 6.3 of Chap. 6, I introduced one-dimensional quantum mechanical
models, in which the potential energy of a particle was described by the simplest
piecewise constant function (or its extreme case, a delta-function), defining a
single potential well or barrier. A natural extension of this model is a potential
energy profile corresponding to several wells and/or barriers (or several delta- 
functions). In principle, one can approach the multi-barrier problem in the same 
way as a single well/barrier situation: divide the entire range of the coordinate into 
regions of constant potential energy, and use the continuity conditions for the wave 
function and its derivative to “stitch” the solutions from different regions. However, 
it is easier said than done. Each new discontinuity point adds two new unknown 
coefficients and correspondingly two equations. If in the case of a single barrier you 
had to deal with the system of four equations, a dual-barrier problem would require 
solving the system of eight equations, and soon even writing those equations down 
becomes a serious burden, and I do not even want to think about having to solve 
them. 

Luckily, there is a better way of dealing with the ever-increasing number of
boundary conditions in problems with multiple jumps of the potential energy.
In this section I will show you a convenient method of arranging the unknown
amplitudes of the wave functions and relating them to each other across the point of
the discontinuity.

Let's move forward by going back to the simplest problem of a step potential
with a single discontinuity:

$$
V(z) = \begin{cases} V_0, & z < 0 \\ V_1, & z > 0, \end{cases} \tag{12.1}
$$

where I assigned the coordinate $z = 0$ (the origin of the coordinate axis) to the
point where the potential makes its jump. If I were to ask a good diligent student
of quantum mechanics to write down a wave function of a particle with energy
$E$ exceeding both $V_0$ and $V_1$, I would most likely have been presented with the
following expression:

$$
\psi(z) = \begin{cases}
A_0 \exp(ik_0 z) + B_0 \exp(-ik_0 z), & z < 0 \\
A_1 \exp(ik_1 z), & z > 0,
\end{cases} \tag{12.2}
$$

where

$$
k_0 = \frac{\sqrt{2m_e (E - V_0)}}{\hbar}, \qquad
k_1 = \frac{\sqrt{2m_e (E - V_1)}}{\hbar}.
$$

This wave function would have been perfectly fine if all I were after was
just the single step-potential problem. In this section, however, I have
further-reaching goals, so I need to generalize this expression, allowing for the
possibility of a wave function component corresponding to particles propagating in
the negative $z$ direction for $z > 0$ as well as for $z < 0$. If you wonder where these
backward propagating particles could come from, just imagine that there might be
another discontinuity in the potential somewhere down the line, at a positive value
of $z$, which would create a flux of reflected particles propagating in the negative $z$
direction at $z > 0$. To take this possibility into account, I will replace Eq. 12.2 with
a wave function of a more general form

$$
\psi(z) = \begin{cases}
A_0 \exp(ik_0 z) + B_0 \exp(-ik_0 z), & z < 0 \\
A_1 \exp(ik_1 z) + B_1 \exp(-ik_1 z), & z > 0.
\end{cases} \tag{12.3}
$$


The continuity of the wave function and its derivative at $z = 0$ then yields

$$
A_0 + B_0 = A_1 + B_1 \tag{12.4}
$$

$$
k_0 (A_0 - B_0) = k_1 (A_1 - B_1). \tag{12.5}
$$

Quite similarly to what I already did in Sect. 6.2.1, I can rewrite these equations as

$$
A_1 = \frac{1}{2}\left(1 + \frac{k_0}{k_1}\right) A_0 + \frac{1}{2}\left(1 - \frac{k_0}{k_1}\right) B_0 \tag{12.6}
$$

$$
B_1 = \frac{1}{2}\left(1 - \frac{k_0}{k_1}\right) A_0 + \frac{1}{2}\left(1 + \frac{k_0}{k_1}\right) B_0. \tag{12.7}
$$


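However, for the next step, I prepared for you something new. After spending some
time staring at these two equations, you might divine that they can be presented in a
matrix form if amplitudes $A_{1,0}$ and $B_{1,0}$ are arranged into two-dimensional column
vectors, while the coefficients in front of $A_0$ and $B_0$ are arranged into a $2 \times 2$ matrix:

$$
\begin{bmatrix} A_1 \\ B_1 \end{bmatrix} =
\begin{bmatrix}
\frac{1}{2}\left(1 + \frac{k_0}{k_1}\right) & \frac{1}{2}\left(1 - \frac{k_0}{k_1}\right) \\
\frac{1}{2}\left(1 - \frac{k_0}{k_1}\right) & \frac{1}{2}\left(1 + \frac{k_0}{k_1}\right)
\end{bmatrix}
\begin{bmatrix} A_0 \\ B_0 \end{bmatrix}. \tag{12.8}
$$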


Go ahead, perform matrix multiplication in Eq. 12.8, and convince yourself that the
result is, indeed, the system of Eqs. 12.6 and 12.7. If we agree to always use the
amplitude of the forward propagating component of the wave function (whatever
appears in front of $\exp(ik_i z)$) as the first element in the two-dimensional column
and the amplitude of the backward propagating component (the one appearing in
front of $\exp(-ik_i z)$) as the second element, I can introduce notation $v_{0,1}$ for the
respective columns, $D^{(1,0)}$ for the matrix

$$
D^{(1,0)} = \begin{bmatrix}
\frac{k_1 + k_0}{2k_1} & \frac{k_1 - k_0}{2k_1} \\
\frac{k_1 - k_0}{2k_1} & \frac{k_1 + k_0}{2k_1}
\end{bmatrix} \tag{12.9}
$$

and rewrite Eq. 12.8 as a compact matrix equation:

$$
v_1 = D^{(1,0)} v_0. \tag{12.10}
$$

Upper indexes in the notation for this matrix are supposed to be read from right to
left and symbolize a transition across a boundary between potentials $V_0$ and $V_1$.
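If you prefer to let a computer do the matrix multiplication, here is a minimal sketch of
Eqs. 12.9 and 12.10 in Python. The function names `interface_matrix` and `apply`, the units
with $\hbar = m_e = 1$, and all numerical values are my assumptions for illustration only.
The script builds $D^{(1,0)}$, applies it to an arbitrary column $v_0$, and checks that the
resulting amplitudes satisfy the continuity conditions, Eqs. 12.4 and 12.5.

```python
import cmath

def interface_matrix(k_left, k_right):
    """Interface matrix of Eq. 12.9 for wave numbers k_left (before) and k_right (after the jump)."""
    s = (k_right + k_left) / (2 * k_right)
    d = (k_right - k_left) / (2 * k_right)
    return [[s, d], [d, s]]

def apply(matrix, vector):
    """Multiply a 2x2 matrix by a two-component column vector."""
    return [matrix[0][0] * vector[0] + matrix[0][1] * vector[1],
            matrix[1][0] * vector[0] + matrix[1][1] * vector[1]]

# Assumed illustrative values in units with hbar = m_e = 1: E = 2, V_0 = 0, V_1 = 1.
E, V0, V1 = 2.0, 0.0, 1.0
k0 = cmath.sqrt(2 * (E - V0))
k1 = cmath.sqrt(2 * (E - V1))

A0, B0 = 1.0, 0.3 - 0.2j                      # arbitrary amplitudes for z < 0
A1, B1 = apply(interface_matrix(k0, k1), [A0, B0])

# Residuals of the continuity conditions, Eqs. 12.4 and 12.5
residual_value = abs((A0 + B0) - (A1 + B1))
residual_slope = abs(k0 * (A0 - B0) - k1 * (A1 - B1))
print(residual_value, residual_slope)
```

Both residuals should vanish up to rounding errors, whatever amplitudes you choose for $v_0$.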

I will not be surprised if at this point you feel a bit disappointed and are thinking:
“so what, dude? This is just a fancy way of presenting what we already know.”
But be patient: patience is a virtue and is usually rewarded. The real utility of
the matrix notation becomes apparent only when you have to deal with potentials
featuring multiple discontinuities. So, let's get to it and assume that at some point
with coordinate $z = z_1$ the potential experiences another jump, changing abruptly
from $V_1$ to $V_2$. If asked to write the expression for the wave function in the regions
between $z = 0$ and $z = z_1$ and for $z > z_1$, you would probably have written

$$
\psi(z) = \begin{cases}
A_1 \exp(ik_1 z) + B_1 \exp(-ik_1 z), & 0 < z < z_1 \\
A_2 \exp(ik_2 z) + B_2 \exp(-ik_2 z), & z > z_1,
\end{cases} \tag{12.11}
$$

which is, of course, a perfectly reasonable and correct expression. However, if you
tried to write down the continuity equations at $z = z_1$ using this wave function and
present them in a matrix form, you would have ended up with a matrix containing
exponential factors like $\exp(\pm ik_{1,2} z_1)$, which would not look at all like the simple
matrix $D^{(1,0)}$ from Eq. 12.9. I can make the situation more attractive by
rewriting the expression for the wave function in a form in which the arguments of
the exponential functions vanish at $z = z_1$:

$$
\psi(z) = \begin{cases}
A_1^{(L)} \exp[ik_1 (z - z_1)] + B_1^{(L)} \exp[-ik_1 (z - z_1)], & 0 < z < z_1 \\
A_1^{(R)} \exp[ik_2 (z - z_1)] + B_1^{(R)} \exp[-ik_2 (z - z_1)], & z > z_1.
\end{cases} \tag{12.12}
$$


This amounts to redefining the amplitudes appearing in front of the respective exponents,
as you will see for yourselves when doing Problem 2 in the exercise section for
this chapter. Please note the change in the notations: instead of distinguishing the
amplitudes by their lower indexes (1 or 2), I introduced upper indexes $L$ and $R$,
indicating that these coefficients describe the wave function immediately to the left
or to the right of the discontinuity point, correspondingly. At the same time, the
lower indexes in all coefficients are now set to be 1, implying that we are dealing
with the discontinuity at point $z = z_1$. In terms of these new coefficients, the
stitching conditions take the form

$$
A_1^{(R)} = \frac{1}{2}\left(1 + \frac{k_1}{k_2}\right) A_1^{(L)} + \frac{1}{2}\left(1 - \frac{k_1}{k_2}\right) B_1^{(L)} \tag{12.13}
$$

$$
B_1^{(R)} = \frac{1}{2}\left(1 - \frac{k_1}{k_2}\right) A_1^{(L)} + \frac{1}{2}\left(1 + \frac{k_1}{k_2}\right) B_1^{(L)}, \tag{12.14}
$$

which follow from Eqs. 12.6 and 12.7 after the obvious substitutions $k_0 \to k_1$
and $k_1 \to k_2$. These equations can again be written in the matrix form

$$
v_1^{(R)} = D^{(2,1)} v_1^{(L)}, \tag{12.15}
$$

where $v_1^{(R)}$ is formed by coefficients $A_1^{(R)}$ and $B_1^{(R)}$, $v_1^{(L)}$ by coefficients
$A_1^{(L)}$ and $B_1^{(L)}$, while matrix $D^{(2,1)}$ is defined as

$$
D^{(2,1)} = \begin{bmatrix}
\frac{k_2 + k_1}{2k_2} & \frac{k_2 - k_1}{2k_2} \\
\frac{k_2 - k_1}{2k_2} & \frac{k_2 + k_1}{2k_2}
\end{bmatrix}. \tag{12.16}
$$


You might have noticed by now the common features of the matrices $D^{(1,0)}$ and
$D^{(2,1)}$: (a) they both describe the transition across a boundary between two values
of the potential ($V_0$ to $V_1$ for the former and $V_1$ to $V_2$ for the latter), (b) they both
connect the pairs of coefficients characterizing the wave functions immediately on
the left of the potential jump with those specifying the wave function immediately
on the right of the jump, and, finally, (c) they have a similar structure, recognizing
which enables you to write down a matrix connecting the wave function amplitudes
across a generic discontinuity as

$$
D^{(i+1,i)} = \begin{bmatrix}
\frac{k_{i+1} + k_i}{2k_{i+1}} & \frac{k_{i+1} - k_i}{2k_{i+1}} \\
\frac{k_{i+1} - k_i}{2k_{i+1}} & \frac{k_{i+1} + k_i}{2k_{i+1}}
\end{bmatrix}, \tag{12.17}
$$


where

$$
k_i = \frac{\sqrt{2m_e (E - V_i)}}{\hbar}
$$

is determined by the potential to the left of the discontinuity and

$$
k_{i+1} = \frac{\sqrt{2m_e (E - V_{i+1})}}{\hbar}
$$

corresponds to the value of the potential to the right of it. It is also not too difficult
to rewrite Eqs. 12.3 and 12.12 for a situation in which a jump of the potential occurs
at an arbitrary point $z = z_i$:

$$
\psi(z) = \begin{cases}
A_i^{(L)} \exp[ik_i (z - z_i)] + B_i^{(L)} \exp[-ik_i (z - z_i)], & z_{i-1} < z < z_i \\
A_i^{(R)} \exp[ik_{i+1} (z - z_i)] + B_i^{(R)} \exp[-ik_{i+1} (z - z_i)], & z > z_i.
\end{cases} \tag{12.18}
$$

Correspondingly, Eq. 12.15 becomes

$$
v_i^{(R)} = D^{(i+1,i)} v_i^{(L)}, \tag{12.19}
$$

where $v_i^{(R)}$ contains $A_i^{(R)}$ and $B_i^{(R)}$, while $v_i^{(L)}$ contains $A_i^{(L)}$ and $B_i^{(L)}$.

I hope that by now I have managed to convince you that using the suggested
matrix notations does have its benefits, but I also suspect that some of you might
have become somewhat skeptical about the generality of this approach. You might
be thinking that all these formulas that I so confidently presented here can only
be valid for energies exceeding all potentials $V_i$ and that this fact strongly limits the
utility of the method. If this did occur to you, accept my commendation for paying
attention, but reality is not as bad as it appears. Let's see what happens if one of
the $V_i$ turns out to be larger than $E$. Obviously, in this case the respective $k_i$ becomes
imaginary and can be written as

$$
k_i = \frac{\sqrt{2m_e (E - V_i)}}{\hbar} = i\,\frac{\sqrt{2m_e (V_i - E)}}{\hbar} \equiv i\kappa_i, \tag{12.20}
$$

where I introduced a new real-valued parameter

$$
\kappa_i = \frac{\sqrt{2m_e (V_i - E)}}{\hbar}. \tag{12.21}
$$


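The corresponding wave function at $z < z_i$ becomes

$$
\psi(z) = A_i^{(L)} \exp[-\kappa_i (z - z_i)] + B_i^{(L)} \exp[\kappa_i (z - z_i)].
$$

The continuity condition of the wave function at $z = z_i$ remains the same as Eq. 12.4:

$$
A_i^{(L)} + B_i^{(L)} = A_i^{(R)} + B_i^{(R)},
$$

while the continuity of the derivative of the wave function yields, instead of
Eq. 12.5,

$$
-\kappa_i \left(A_i^{(L)} - B_i^{(L)}\right) = ik_{i+1} \left(A_i^{(R)} - B_i^{(R)}\right),
$$

where I assumed for the sake of argument that $E > V_{i+1}$. Combining these two
equations, I get, instead of Eqs. 12.6 and 12.7,

$$
A_i^{(R)} = \frac{1}{2} A_i^{(L)} \left(1 - \frac{\kappa_i}{ik_{i+1}}\right) + \frac{1}{2} B_i^{(L)} \left(1 + \frac{\kappa_i}{ik_{i+1}}\right)
$$

$$
B_i^{(R)} = \frac{1}{2} A_i^{(L)} \left(1 + \frac{\kappa_i}{ik_{i+1}}\right) + \frac{1}{2} B_i^{(L)} \left(1 - \frac{\kappa_i}{ik_{i+1}}\right),
$$

which can be written again in the form of Eq. 12.19 with a new matrix

$$
\tilde{D}^{(i+1,i)} = \begin{bmatrix}
\frac{ik_{i+1} - \kappa_i}{2ik_{i+1}} & \frac{ik_{i+1} + \kappa_i}{2ik_{i+1}} \\
\frac{ik_{i+1} + \kappa_i}{2ik_{i+1}} & \frac{ik_{i+1} - \kappa_i}{2ik_{i+1}}
\end{bmatrix} = \begin{bmatrix}
\frac{k_{i+1} + i\kappa_i}{2k_{i+1}} & \frac{k_{i+1} - i\kappa_i}{2k_{i+1}} \\
\frac{k_{i+1} - i\kappa_i}{2k_{i+1}} & \frac{k_{i+1} + i\kappa_i}{2k_{i+1}}
\end{bmatrix}.
$$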


Comparing $\tilde{D}^{(i+1,i)}$ with $D^{(i+1,i)}$ in Eq. 12.17, you can immediately see that the
former can be obtained from the latter with the simple substitution defined in Eq. 12.20.
Therefore, you do not really have to worry about the relation between energy and
the respective value of the potential: Eq. 12.17 works in all cases, and if $k_i$ turns out
to be imaginary, you just need to replace it with $i\kappa_i$ as prescribed by Eq. 12.20 (or
just let the computer do it for you). Consequently, matrix $\tilde{D}^{(i+1,i)}$ turns out to be
perfectly unnecessary and will not be used any more, but there is one circumstance
to which you must pay close attention. The special significance of Eq. 12.20 is that
when $k_i$ turns imaginary, it forces it to have a positive imaginary part (the square root
obviously allows for either a positive or a negative sign). As a result, the exponential
factor at the amplitude designated as $A_i$ acquires a real negative argument, while
the exponential factor multiplied by amplitude $B_i$ gets a real positive argument. You
need to take this into account when designating corresponding amplitudes as $A$ or $B$
and placing them in the first or the second row of your column vector. (Obviously,
it is not the actual symbols used to designate the amplitudes that are important, but
their places in the column vector.)
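Letting the computer do the substitution is indeed painless. The sketch below is my own
illustration, in units with $\hbar = m_e = 1$ and with made-up values of $E$, $V_i$ and
$V_{i+1}$. It relies on the fact that Python's `cmath.sqrt` returns, for a negative real
argument, the root with a positive imaginary part, which is exactly the choice prescribed
by Eq. 12.20. It then confirms that Eq. 12.17 evaluated with $k_i = i\kappa_i$ reproduces
the matrix written out explicitly for $E < V_i$.

```python
import cmath

def wavenumber(E, V, m=1.0, hbar=1.0):
    """k = sqrt(2 m (E - V)) / hbar; for real E < V the principal root equals +i*kappa."""
    return cmath.sqrt(2.0 * m * (E - V)) / hbar

def interface_matrix(k_left, k_right):
    """Interface matrix of Eq. 12.17; accepts real or imaginary wave numbers."""
    s = (k_right + k_left) / (2 * k_right)
    d = (k_right - k_left) / (2 * k_right)
    return [[s, d], [d, s]]

# Assumed illustrative values: E = 1 lies below V_i = 3 and above V_{i+1} = 0.
E, V_left, V_right = 1.0, 3.0, 0.0
k_left = wavenumber(E, V_left)       # purely imaginary, i*kappa
k_right = wavenumber(E, V_right)     # real
kappa = k_left.imag

# Eq. 12.17 evaluated with k_i = i*kappa ...
D = interface_matrix(k_left, k_right)
# ... compared with the matrix written out explicitly for E < V_i
plus = (k_right + 1j * kappa) / (2 * k_right)
minus = (k_right - 1j * kappa) / (2 * k_right)
D_explicit = [[plus, minus], [minus, plus]]
max_difference = max(abs(D[i][j] - D_explicit[i][j])
                     for i in range(2) for j in range(2))
print(k_left, kappa, max_difference)
```

Note that this branch choice is automatic only for real energies; if you ever feed the
function a complex argument, you have to check the sign of the imaginary part yourself.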

I hope that your head is not spinning yet, but as a prophylactic measure, let me
summarize what we have achieved so far. We are considering a particle moving in
a piecewise constant potential, which has interruptions of continuity at a number
of points with coordinates $z = z_i$ (the first discontinuity occurs at $z_0 = 0$).
When crossing $z_i$, the potential jumps from $V_i$ to $V_{i+1}$. In the vicinity of each
discontinuity point, the wave function is presented by Eq. 12.18, organized in such
a way that coefficients with upper index $L$ determine amplitudes of the right- and
left-propagating components of the wave function on the left of the discontinuity
and coefficients with upper index $R$ determine the same amplitudes on the right of
$z_i$. The connection between these pairs of coefficients is described by the matrix
equation as presented by Eq. 12.19.

To help you get a better feeling for why this matrix representation is useful, let
me put together the matrix equations for a few successive discontinuity points:

$$
v_0^{(R)} = D^{(1,0)} v_0^{(L)}, \qquad v_1^{(R)} = D^{(2,1)} v_1^{(L)}, \qquad v_2^{(R)} = D^{(3,2)} v_2^{(L)}. \tag{12.22}
$$

The structure of these equations indicates that it might be possible to relate column
vector $v_2^{(R)}$ to $v_0^{(L)}$ by consecutive matrix multiplication if we had matrices relating
$v_2^{(L)}$ to $v_1^{(R)}$, $v_1^{(L)}$ to $v_0^{(R)}$, and, in general, $v_i^{(L)}$ to $v_{i-1}^{(R)}$. To find these matrices, I have
to take you back to Eq. 12.18, where you should notice that the pairs of coefficients
$A_{i-1}^{(R)}, B_{i-1}^{(R)}$ and $A_i^{(L)}, B_i^{(L)}$ describe the wave function defined on the same interval
$z_{i-1} < z < z_i$. Accordingly, the following must be true:

$$
A_i^{(L)} \exp[ik_i (z - z_i)] + B_i^{(L)} \exp[-ik_i (z - z_i)] =
A_{i-1}^{(R)} \exp[ik_i (z - z_{i-1})] + B_{i-1}^{(R)} \exp[-ik_i (z - z_{i-1})],
$$

which is satisfied if

$$
A_i^{(L)} \exp[ik_i (z - z_i)] = A_{i-1}^{(R)} \exp[ik_i (z - z_{i-1})]
$$

and

$$
B_i^{(L)} \exp[-ik_i (z - z_i)] = B_{i-1}^{(R)} \exp[-ik_i (z - z_{i-1})].
$$

Canceling the common factors $\exp(\pm ik_i z)$, you find

$$
A_i^{(L)} = \exp[ik_i (z_i - z_{i-1})]\, A_{i-1}^{(R)} \tag{12.23}
$$

$$
B_i^{(L)} = \exp[-ik_i (z_i - z_{i-1})]\, B_{i-1}^{(R)}, \tag{12.24}
$$

which can be presented in the matrix form as

$$
\begin{bmatrix} A_i^{(L)} \\ B_i^{(L)} \end{bmatrix} =
\begin{bmatrix}
\exp[ik_i (z_i - z_{i-1})] & 0 \\
0 & \exp[-ik_i (z_i - z_{i-1})]
\end{bmatrix}
\begin{bmatrix} A_{i-1}^{(R)} \\ B_{i-1}^{(R)} \end{bmatrix}. \tag{12.25}
$$

Introducing the diagonal matrix

$$
M^{(i)} = \begin{bmatrix}
\exp[ik_i (z_i - z_{i-1})] & 0 \\
0 & \exp[-ik_i (z_i - z_{i-1})]
\end{bmatrix}, \tag{12.26}
$$

I can give Eq. 12.25 the form

$$
v_i^{(L)} = M^{(i)} v_{i-1}^{(R)}, \tag{12.27}
$$


which you can recognize as the missing relation between $v_i^{(L)}$ and $v_{i-1}^{(R)}$. Note that
the upper index in $M^{(i)}$ signifies that it corresponds to the region of coordinates
$z_{i-1} < z < z_i$, where the potential is equal to $V_i$. It is important to note that Eq. 12.26
can be used even if $k_i$ turns out to be imaginary. All you need to do in this case
is to replace $k_i$ with $i\kappa_i$ according to Eqs. 12.20 and 12.21. Now, complementing
Eq. 12.22 with the missing links, you get

$$
v_0^{(R)} = D^{(1,0)} v_0^{(L)}; \quad v_1^{(L)} = M^{(1)} v_0^{(R)}; \quad v_1^{(R)} = D^{(2,1)} v_1^{(L)}; \quad
v_2^{(L)} = M^{(2)} v_1^{(R)}; \quad v_2^{(R)} = D^{(3,2)} v_2^{(L)}, \tag{12.28}
$$

which, after combining all successive matrix relations, yields

$$
v_2^{(R)} = D^{(3,2)} M^{(2)} D^{(2,1)} M^{(1)} D^{(1,0)} v_0^{(L)}. \tag{12.29}
$$


This result illuminates the power of the method presented here: the amplitudes
of the wave function after the particles have encountered three discontinuity
points of the potential are expressed in terms of the amplitudes specifying the wave
function in the region before the first discontinuity via a simple matrix relation,
$v_2^{(R)} = T^{(3)} v_0^{(L)}$, where matrix $T^{(3)}$, called the transfer matrix, is the product of five
matrices of two different kinds:

$$
T^{(3)} = D^{(3,2)} M^{(2)} D^{(2,1)} M^{(1)} D^{(1,0)}.
$$

Matrices $D^{(i+1,i)}$ can be called interface matrices as they describe the transformation of
the wave function amplitudes due to crossing of an interface between two distinct
values of the potential, and you can use the name “free propagation matrices”
for $M^{(i)}$ because they describe the evolution of the wave function due to free
propagation of the particle between two discontinuities. Equation 12.29 has a
simple physical interpretation if you read it from right to left: a particle begins
with a wave function characterized by column vector $v_0^{(L)}$. It encounters the first
discontinuity at $z = z_0$, and upon crossing it the wave function coefficients undergo
the transformation prescribed by matrix $D^{(1,0)}$. After that the wave function evolves
as it would for a free particle in potential $V_1$; this evolution is described by the
propagation matrix $M^{(1)}$. The crossing of the boundary between the $V_1$ and $V_2$ regions is
represented by the interface matrix $D^{(2,1)}$ and so on and so forth. One of the possible
potential profiles that could have been described by Eq. 12.29 is shown in Fig. 12.1.

Fig. 12.1 A potential profile corresponding to Eq. 12.29

Equation 12.29 is trivially generalized to the case of an arbitrary number, $N$, of
discontinuities, located at points $z_i$, $i = 0, 1, 2, \dots, N - 1$ with $z_0 = 0$:

$$
v_{N-1}^{(R)} = T^{(N)} v_0^{(L)} \tag{12.30}
$$

with a corresponding transfer matrix defined as

$$
T^{(N)} = D^{(N,N-1)} M^{(N-1)} \cdots D^{(2,1)} M^{(1)} D^{(1,0)}. \tag{12.31}
$$

Once the transfer matrix is known, you can use it to obtain all the information about
wave functions (and corresponding energy eigenvalues when appropriate) of the
particle in the corresponding potential both in the continuous and discrete segments
of the energy spectrum. The next section in this chapter discusses how this can be
done.
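The product in Eq. 12.31 is a natural candidate for a short loop. The following sketch is
only one possible way to organize it; the function names, the list-based interface, the
units with $\hbar = m_e = 1$, and the sample potentials are all my assumptions. The
potential is described by the list of its values $V_0, \dots, V_N$ and the list of
discontinuity points $z_0, \dots, z_{N-1}$.

```python
import cmath

def wavenumber(E, V, m=1.0, hbar=1.0):
    """Wave number of Eq. 12.20; imaginary with positive imaginary part when E < V."""
    return cmath.sqrt(2.0 * m * (E - V)) / hbar

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][n] * b[n][j] for n in range(2)) for j in range(2)]
            for i in range(2)]

def interface_matrix(k_left, k_right):
    """Interface matrix D^(i+1,i) of Eq. 12.17 (requires k_right != 0)."""
    s = (k_right + k_left) / (2 * k_right)
    d = (k_right - k_left) / (2 * k_right)
    return [[s, d], [d, s]]

def propagation_matrix(k, width):
    """Free propagation matrix M^(i) of Eq. 12.26 for a region of the given width."""
    return [[cmath.exp(1j * k * width), 0.0], [0.0, cmath.exp(-1j * k * width)]]

def transfer_matrix(E, potentials, positions):
    """T^(N) of Eq. 12.31; potentials = [V_0, ..., V_N], positions = [z_0, ..., z_{N-1}]."""
    ks = [wavenumber(E, V) for V in potentials]
    T = interface_matrix(ks[0], ks[1])
    for i in range(1, len(positions)):
        width = positions[i] - positions[i - 1]
        T = matmul(propagation_matrix(ks[i], width), T)      # free flight in region i
        T = matmul(interface_matrix(ks[i], ks[i + 1]), T)    # jump from V_i to V_{i+1}
    return T

# Three discontinuities, as in Eq. 12.29 (assumed values)
T3 = transfer_matrix(2.0, [0.0, 1.0, 0.5, 1.5], [0.0, 1.0, 2.0])

# Sanity check: if the potential does not actually jump, only free propagation remains
T_free = transfer_matrix(2.0, [0.0, 0.0, 0.0], [0.0, 1.0])
print(T3, T_free)
```

New factors are multiplied from the left, which mirrors the right-to-left reading of
Eq. 12.29: the matrix closest to $v_0^{(L)}$ acts first.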


12.1.2 Application of Transfer-Matrix Formalism to Generic 
Scattering and Bound State Problems 

12.1.2.1 Generic Scattering Problem via the Transfer Matrix 

Having defined a generic transfer matrix $T^{(N)}$, I can now solve a typical scattering
problem similar to the one discussed in Sect. 6.2.1. Setting it up amounts to
specifying the wave function of the particle at $z < 0$ (before the particle encounters
the first break of the continuity) and at $z > z_{N-1}$ (after the particle passes through
the last discontinuity point). The scattering wave function introduced in Sect. 6.2.1,

$$
\psi(z) = \begin{cases}
\exp(ik_0 z) + r \exp(-ik_0 z), & z < 0 \\
t \exp(ik_N z), & z > z_{N-1},
\end{cases} \tag{12.32}
$$

is in the transfer-matrix formalism described by column vectors $v_0^{(L)}$ and $v_{N-1}^{(R)}$:

$$
v_0^{(L)} = \begin{bmatrix} 1 \\ r \end{bmatrix}, \qquad
v_{N-1}^{(R)} = \begin{bmatrix} t \\ 0 \end{bmatrix}. \tag{12.33}
$$

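Presenting the generic T-matrix by its (presumably known) elements

$$
T^{(N)} = \begin{bmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{bmatrix},
$$

I can rewrite Eq. 12.30 in the expanded form as

$$
\begin{bmatrix} t \\ 0 \end{bmatrix} =
\begin{bmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{bmatrix}
\begin{bmatrix} 1 \\ r \end{bmatrix}.
$$

This translates into the system of linear equations:

$$
t = t_{11} + r t_{12}
$$

$$
0 = t_{21} + r t_{22}.
$$

From the second of these equations, I immediately have

$$
r = -\frac{t_{21}}{t_{22}}, \tag{12.34}
$$

and substituting this result into the first one, I find

$$
t = t_{11} - \frac{t_{12} t_{21}}{t_{22}} = \frac{\det\left(T^{(N)}\right)}{t_{22}}. \tag{12.35}
$$

Here $\det\left(T^{(N)}\right) = t_{11} t_{22} - t_{12} t_{21}$ is the determinant of the T-matrix $T^{(N)}$, which,
believe it or not, can actually be quite easily computed for the most general transfer
matrix defined in Eq. 12.31.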

To do so you must, first, recall that the determinant of the product of matrices
is equal to the product of the determinants of the individual factors:

$$
\det\left(T^{(N)}\right) = \det\left(D^{(N,N-1)}\right) \det\left(M^{(N-1)}\right) \cdots
\det\left(D^{(2,1)}\right) \det\left(M^{(1)}\right) \det\left(D^{(1,0)}\right). \tag{12.36}
$$

It is easy to see that $\det\left(M^{(i)}\right) = 1$ for any $i$, so all these factors can be omitted
from Eq. 12.36 yielding

$$
\det\left(T^{(N)}\right) = \det\left(D^{(N,N-1)}\right) \det\left(D^{(N-1,N-2)}\right) \cdots
\det\left(D^{(2,1)}\right) \det\left(D^{(1,0)}\right). \tag{12.37}
$$


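Now all I need is to compute the determinant of the generic matrix $D^{(i+1,i)}$. Using
Eq. 12.17, I find

$$
\det\left(D^{(i+1,i)}\right) = \left(\frac{k_{i+1} + k_i}{2k_{i+1}}\right)^2 - \left(\frac{k_{i+1} - k_i}{2k_{i+1}}\right)^2 = \frac{k_i}{k_{i+1}},
$$

which leads to the following expression for $\det\left(T^{(N)}\right)$:

$$
\det\left(T^{(N)}\right) = \frac{k_{N-1}}{k_N} \frac{k_{N-2}}{k_{N-1}} \cdots \frac{k_1}{k_2} \frac{k_0}{k_1} = \frac{k_0}{k_N}. \tag{12.38}
$$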


Isn't it amazing how all $k_i$ in the intermediate regions got canceled, so that the
determinant depends only upon the wave numbers (real or imaginary) in the first
and the last region of the constant potential? Using this result in Eq. 12.35, I can find
a simplified expression for the transmission amplitude

$$
t = \frac{k_0}{k_N} \frac{1}{t_{22}}, \tag{12.39}
$$

which becomes even simpler if the potential for $z < 0$ and for $z > z_{N-1}$ is the
same. In this case the determinant of the transfer matrix becomes equal to unity
and $t = 1/t_{22}$. Having found $r$ and $t$, I can restore the wave function in the entire
range of the coordinate by consecutively applying the interface and propagation matrices
constituting the total transfer matrix $T^{(N)}$.

With the help of Eq. 6.53 from Sect. 6.2.1, I can also find the corresponding reflection
and transmission probabilities:

$$
R = |r|^2 = \frac{|t_{21}|^2}{|t_{22}|^2}, \qquad
T = \frac{k_N}{k_0} |t|^2 = \frac{k_0}{k_N} \frac{1}{|t_{22}|^2},
$$

where I used Eq. 12.39 for $t$. Since reflection and transmission probabilities must
obey the condition $R + T = 1$, it imposes the following general condition on the
elements of the transfer matrix:

$$
|t_{22}|^2 - |t_{21}|^2 = \frac{k_0}{k_N},
$$

which reduces to $|t_{22}|^2 - |t_{21}|^2 = 1$ when the potential is the same on both sides
of the structure ($k_N = k_0$).

Fig. 12.2 An example of a potential with discrete spectrum
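The determinant formula, Eq. 12.38, and the flux-conservation condition $R + T = 1$ are
easy to check numerically. The sketch below is an illustration under my own assumptions:
units with $\hbar = m_e = 1$, hypothetical helper names, and a made-up potential whose
second region is a barrier with $V_1 > E$, so that one intermediate wave number is
imaginary. It computes $r$ and $t$ from Eqs. 12.34 and 12.35 and then the two probabilities.

```python
import cmath

def wavenumber(E, V, m=1.0, hbar=1.0):
    return cmath.sqrt(2.0 * m * (E - V)) / hbar

def matmul(a, b):
    return [[sum(a[i][n] * b[n][j] for n in range(2)) for j in range(2)]
            for i in range(2)]

def interface_matrix(k_left, k_right):
    s = (k_right + k_left) / (2 * k_right)
    d = (k_right - k_left) / (2 * k_right)
    return [[s, d], [d, s]]

def propagation_matrix(k, width):
    return [[cmath.exp(1j * k * width), 0.0], [0.0, cmath.exp(-1j * k * width)]]

def transfer_matrix(E, potentials, positions):
    """T^(N) of Eq. 12.31; potentials = [V_0, ..., V_N], positions = [z_0, ..., z_{N-1}]."""
    ks = [wavenumber(E, V) for V in potentials]
    T = interface_matrix(ks[0], ks[1])
    for i in range(1, len(positions)):
        T = matmul(propagation_matrix(ks[i], positions[i] - positions[i - 1]), T)
        T = matmul(interface_matrix(ks[i], ks[i + 1]), T)
    return T

# Assumed example: E = 2 with a barrier (V_1 = 5 > E) followed by two lower steps
E = 2.0
potentials = [0.0, 5.0, 1.0, 0.5]
positions = [0.0, 1.0, 2.5]

T_matrix = transfer_matrix(E, potentials, positions)
k_first = wavenumber(E, potentials[0])
k_last = wavenumber(E, potentials[-1])

det_T = T_matrix[0][0] * T_matrix[1][1] - T_matrix[0][1] * T_matrix[1][0]
r = -T_matrix[1][0] / T_matrix[1][1]          # Eq. 12.34
t = det_T / T_matrix[1][1]                    # Eq. 12.35

R = abs(r) ** 2
T_prob = (k_last / k_first).real * abs(t) ** 2
print(det_T, k_first / k_last, R, T_prob, R + T_prob)
```

The determinant agrees with $k_0/k_N$ even though the barrier region contributes an imaginary
wave number, in line with the remark that Eq. 12.38 holds for real and imaginary $k_i$ alike.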


12.1.2.2 Finding Bound States with the Transfer Matrix 

Now let me show how the transfer-matrix method can be used to find energies of the
bound states, if they are allowed by the potential. Consider, for instance, the potential
shown in Fig. 12.2. After eyeballing this figure for a few moments and recalling
that discrete energy levels correspond to classically bound motion, you should be
able to conclude that states with energies in the interval $V_3 < E < V_4$ must
belong to the discrete spectrum. An important general point to make here is that
the discrete spectrum in such a Hamiltonian exists at energies which are smaller
than the limiting values of the potential at $z \to \pm\infty$ and larger than the potential's
smallest value, provided that these conditions are not self-contradictory. For such
values of energy, the solutions of the Schrödinger equation for $z < 0$ and $z > z_{N-1}$
(classically forbidden regions) take the form of real exponential functions, so that
instead of Eq. 12.32, I have

$$
\psi(z) = \begin{cases}
B_0 \exp(\kappa_0 z), & z < 0 \\
A_N \exp(-\kappa_N z), & z > z_{N-1},
\end{cases} \tag{12.40}
$$

where

$$
\kappa_0 = \frac{\sqrt{2m_e (V_0 - E)}}{\hbar}, \qquad
\kappa_N = \frac{\sqrt{2m_e (V_N - E)}}{\hbar}.
$$

Before continuing I have to reiterate a point that I already made earlier in this
section. Equation 12.40 was obtained by making a transition from parameters $k_0$ and
$k_N$, which become imaginary for the chosen values of energy, to real parameters
$\kappa_0$ and $\kappa_N$ with the help of Eq. 12.20. This procedure turns exponential functions
$\exp(\pm ikz)$ into $\exp(\mp \kappa z)$. Accordingly, in order to preserve the structure of my
transfer matrices, I have to designate amplitude coefficients in front of $\exp(\kappa_i z)$ as
$B_i$ and coefficients in front of $\exp(-\kappa_i z)$ as $A_i$. Finally, I feel obliged to remind
you that I discarded exponentially growing terms in Eq. 12.40 in order to preserve
normalizability of the wave function. Thus, now, vectors $v_0^{(L)}$ and $v_{N-1}^{(R)}$, instead
of Eq. 12.33, take the form

$$
v_0^{(L)} = \begin{bmatrix} 0 \\ B_0 \end{bmatrix}, \qquad
v_{N-1}^{(R)} = \begin{bmatrix} A_N \\ 0 \end{bmatrix}.
$$

The resulting transfer-matrix equation in this case becomes

$$
\begin{bmatrix} A_N \\ 0 \end{bmatrix} =
\begin{bmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{bmatrix}
\begin{bmatrix} 0 \\ B_0 \end{bmatrix},
$$

which yields

$$
A_N = t_{12} B_0
$$

$$
0 = t_{22} B_0.
$$

The last of these equations produces an equation for the allowed energy values, since
it can only be fulfilled for nonvanishing $B_0$ and $A_N$ if

$$
t_{22}(E) = 0. \tag{12.41}
$$

The first of these equations expresses $A_N$ in terms of the remaining undetermined
coefficient $B_0$, which can be fixed by the normalization requirement.
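Equation 12.41 lends itself to a simple numerical root search. The sketch below scans the
energy interval for sign changes of $t_{22}(E)$ and refines each one by bisection. Bisection
is my choice for the illustration, not something prescribed by the method; the helper names,
the units with $\hbar = m_e = 1$, and the sample well ($V_b = 10$, $V_w = 0$, $d = 2$) are
likewise assumptions. For this symmetric well $t_{22}$ is real up to rounding errors, so I
search for zeros of its real part.

```python
import cmath

def wavenumber(E, V, m=1.0, hbar=1.0):
    return cmath.sqrt(2.0 * m * (E - V)) / hbar

def matmul(a, b):
    return [[sum(a[i][n] * b[n][j] for n in range(2)) for j in range(2)]
            for i in range(2)]

def interface_matrix(k_left, k_right):
    s = (k_right + k_left) / (2 * k_right)
    d = (k_right - k_left) / (2 * k_right)
    return [[s, d], [d, s]]

def propagation_matrix(k, width):
    return [[cmath.exp(1j * k * width), 0.0], [0.0, cmath.exp(-1j * k * width)]]

def transfer_matrix(E, potentials, positions):
    ks = [wavenumber(E, V) for V in potentials]
    T = interface_matrix(ks[0], ks[1])
    for i in range(1, len(positions)):
        T = matmul(propagation_matrix(ks[i], positions[i] - positions[i - 1]), T)
        T = matmul(interface_matrix(ks[i], ks[i + 1]), T)
    return T

def bound_state_energies(potentials, positions, E_min, E_max, samples=400, iterations=60):
    """Find zeros of Re t_22(E) in (E_min, E_max): grid scan plus bisection."""
    def f(E):
        return transfer_matrix(E, potentials, positions)[1][1].real
    step = (E_max - E_min) / samples
    grid = [E_min + step * (n + 0.5) for n in range(samples)]   # avoids the end points
    roots = []
    for a, b in zip(grid, grid[1:]):
        fa, fb = f(a), f(b)
        if fa * fb < 0:
            for _ in range(iterations):
                mid = 0.5 * (a + b)
                fm = f(mid)
                if fa * fm <= 0:
                    b, fb = mid, fm
                else:
                    a, fa = mid, fm
            roots.append(0.5 * (a + b))
    return roots

# Assumed example: symmetric well of depth 10 and width 2
V_b, V_w, d = 10.0, 0.0, 2.0
levels = bound_state_energies([V_b, V_w, V_b], [0.0, d], V_w, V_b)
print(levels)
```

The grid is built from interval midpoints on purpose: at $E = V_w$ or $E = V_b$ one of the
wave numbers vanishes and the interface matrices are not defined.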


12.1.3 Application of the Transfer Matrix to a Symmetrical 
Potential Well 

To illustrate the transfer-matrix method, I will now apply it to a problem which
we have already solved in Sect. 6.2.1: the states of a particle in a symmetric
rectangular potential well. To facilitate application of the transfer-matrix approach,
I will describe this potential by the function

$$
V(z) = \begin{cases}
V_b, & z < 0 \\
V_w, & 0 < z < d \\
V_b, & z > d,
\end{cases} \tag{12.42}
$$

which differs from the one used in Sect. 6.2.1 by the choice of the origin of the
coordinate axis for $z$. This potential has two discontinuity points: it changes from
$V_b$ to $V_w$ at $z_0 = 0$ and then, again, from $V_w$ to $V_b$ at $z_1 = d$, where it is assumed
that $V_b > V_w$. Correspondingly, I need to introduce two interface matrices: $D^{(1,0)}$ as
defined in Eq. 12.9 with $k_0 = \sqrt{2m_e (E - V_b)}/\hbar$ and $k_1 = \sqrt{2m_e (E - V_w)}/\hbar$, and $D^{(2,1)}$
defined in Eq. 12.16 with $k_2 = k_0$.

12.1.3.1 Scattering States 

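Scattering states (continuous spectrum) of this potential correspond to energies
$E > V_b$, in which case parameters $k_0$ and $k_1$ are regular real-valued wave numbers.
Inserting the free propagation matrix $M^{(1)}$ from Eq. 12.26 between $D^{(2,1)}$ and $D^{(1,0)}$
according to Eq. 12.31 and taking into account that $z_0 = 0$ and $z_1 = d$, I obtain the
total T-matrix

$$
\begin{aligned}
T^{(2)} &= D^{(2,1)} M^{(1)} D^{(1,0)} =
\begin{bmatrix}
\frac{k_0 + k_1}{2k_0} & \frac{k_0 - k_1}{2k_0} \\
\frac{k_0 - k_1}{2k_0} & \frac{k_0 + k_1}{2k_0}
\end{bmatrix}
\begin{bmatrix}
e^{ik_1 d} & 0 \\
0 & e^{-ik_1 d}
\end{bmatrix}
\begin{bmatrix}
\frac{k_1 + k_0}{2k_1} & \frac{k_1 - k_0}{2k_1} \\
\frac{k_1 - k_0}{2k_1} & \frac{k_1 + k_0}{2k_1}
\end{bmatrix} \\
&= \begin{bmatrix}
\frac{(k_0 + k_1)^2 e^{ik_1 d} - (k_0 - k_1)^2 e^{-ik_1 d}}{4k_0 k_1} &
\frac{(k_1^2 - k_0^2)\left[e^{ik_1 d} - e^{-ik_1 d}\right]}{4k_0 k_1} \\
-\frac{(k_1^2 - k_0^2)\left[e^{ik_1 d} - e^{-ik_1 d}\right]}{4k_0 k_1} &
\frac{(k_0 + k_1)^2 e^{-ik_1 d} - (k_0 - k_1)^2 e^{ik_1 d}}{4k_0 k_1}
\end{bmatrix} \\
&= \frac{1}{2k_0 k_1}
\begin{bmatrix}
i (k_0^2 + k_1^2) \sin k_1 d + 2k_0 k_1 \cos k_1 d & i (k_1^2 - k_0^2) \sin k_1 d \\
-i (k_1^2 - k_0^2) \sin k_1 d & -i (k_0^2 + k_1^2) \sin k_1 d + 2k_0 k_1 \cos k_1 d
\end{bmatrix}.
\end{aligned} \tag{12.43}
$$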


Substitution of the corresponding elements of the T-matrix from the last expression
into Eqs. 12.34 and 12.39 yields the amplitude reflection and transmission
coefficients:

$$
r = \frac{i (k_1^2 - k_0^2) \sin k_1 d}{-i (k_0^2 + k_1^2) \sin k_1 d + 2k_0 k_1 \cos k_1 d}
= \frac{(k_0^2 - k_1^2) \sin k_1 d}{(k_0^2 + k_1^2) \sin k_1 d + 2ik_0 k_1 \cos k_1 d} \tag{12.44}
$$

$$
t = \frac{2k_0 k_1}{-i (k_0^2 + k_1^2) \sin k_1 d + 2k_0 k_1 \cos k_1 d}
= \frac{2ik_0 k_1}{(k_0^2 + k_1^2) \sin k_1 d + 2ik_0 k_1 \cos k_1 d}, \tag{12.45}
$$

where at the last steps, the numerators and denominators of the expressions for $r$ and
$t$ were multiplied by $i$. The resulting expressions coincide with Eqs. 6.58 and 12.35
of Sect. 6.2, which, of course, is not surprising. Having found the reflection and
transmission amplitudes, I can easily restore the entire wave function. Indeed,
substitution of Eqs. 12.45 and 12.44 into Eq. 12.32 yields the wave function for
$z < 0$ and $z > d$. Next, using Eq. 12.10 with $v_0^{(L)}$ in the form

$$
v_0^{(L)} = \begin{bmatrix} 1 \\ r \end{bmatrix},
$$

I find coefficients $A_0^{(R)}$ and $B_0^{(R)}$:

$$
\begin{bmatrix} A_0^{(R)} \\ B_0^{(R)} \end{bmatrix} =
\begin{bmatrix}
\frac{k_1 + k_0}{2k_1} & \frac{k_1 - k_0}{2k_1} \\
\frac{k_1 - k_0}{2k_1} & \frac{k_1 + k_0}{2k_1}
\end{bmatrix}
\begin{bmatrix} 1 \\ r \end{bmatrix},
$$

$$
A_0^{(R)} = \frac{k_1 + k_0 + r (k_1 - k_0)}{2k_1} \tag{12.46}
$$

$$
B_0^{(R)} = \frac{k_1 - k_0 + r (k_1 + k_0)}{2k_1}, \tag{12.47}
$$

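which generate the wave function in the region $0 < z < d$:

$$
\psi(z) = \frac{k_1 (1 + r) + k_0 (1 - r)}{2k_1} e^{ik_1 z} + \frac{k_1 (1 + r) - k_0 (1 - r)}{2k_1} e^{-ik_1 z}.
$$

I will leave it as an exercise to demonstrate that coefficients $A_0^{(R)}$ and $B_0^{(R)}$ in
Eqs. 12.46 and 12.47 coincide with coefficients $A_2$ and $B_2$ in Eqs. 6.60 and 6.61
in Sect. 6.2. Rewriting the expression for the wave function as

$$
\psi(z) = \frac{k_1 (1 + r) + k_0 (1 - r)}{2k_1} e^{ik_1 d} e^{ik_1 (z - d)}
+ \frac{k_1 (1 + r) - k_0 (1 - r)}{2k_1} e^{-ik_1 d} e^{-ik_1 (z - d)},
$$

where I simply multiplied each term by $\exp(ik_1 d) \exp(-ik_1 d) = 1$, you can identify
coefficients $A_1^{(L)}$ and $B_1^{(L)}$ as

$$
A_1^{(L)} = \frac{k_1 + k_0 + r (k_1 - k_0)}{2k_1} e^{ik_1 d}
$$

$$
B_1^{(L)} = \frac{k_1 - k_0 + r (k_1 + k_0)}{2k_1} e^{-ik_1 d}.
$$

The same expressions for $A_1^{(L)}$ and $B_1^{(L)}$ can obviously be found by multiplying the
vector $v_0^{(R)}$, formed by coefficients $A_0^{(R)}$ and $B_0^{(R)}$, by the diagonal matrix $M^{(1)}$.
Finally, in order to convince the skeptics that the outlined procedure is self-consistent,
you can try to apply the interface matrix $D^{(2,1)}$ to $A_1^{(L)}$ and $B_1^{(L)}$:

$$
\begin{bmatrix} A_1^{(R)} \\ B_1^{(R)} \end{bmatrix} =
\begin{bmatrix}
\frac{k_0 + k_1}{2k_0} & \frac{k_0 - k_1}{2k_0} \\
\frac{k_0 - k_1}{2k_0} & \frac{k_0 + k_1}{2k_0}
\end{bmatrix}
\begin{bmatrix}
\frac{k_1 + k_0 + r (k_1 - k_0)}{2k_1} e^{ik_1 d} \\
\frac{k_1 - k_0 + r (k_1 + k_0)}{2k_1} e^{-ik_1 d}
\end{bmatrix},
$$

yielding for $A_1^{(R)}$

$$
\begin{aligned}
A_1^{(R)} &= \frac{(k_0 + k_1)^2}{4k_0 k_1} e^{ik_1 d} + r \frac{k_1^2 - k_0^2}{4k_0 k_1} e^{ik_1 d}
- \frac{(k_0 - k_1)^2}{4k_0 k_1} e^{-ik_1 d} - r \frac{k_1^2 - k_0^2}{4k_0 k_1} e^{-ik_1 d} \\
&= \left(\frac{k_0 (1 - r)}{4k_1} + \frac{k_1 (1 + r)}{4k_0} + \frac{1}{2}\right) e^{ik_1 d}
- \left(\frac{k_0 (1 - r)}{4k_1} + \frac{k_1 (1 + r)}{4k_0} - \frac{1}{2}\right) e^{-ik_1 d}.
\end{aligned}
$$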

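To continue I have to use the reflection coefficient $r$ given by Eq. 12.44. Evaluating
parts of the expression for $A_1^{(R)}$ separately, I find

$$
\frac{k_0 (1 - r)}{4k_1} = \frac{k_0}{4k_1}\left[1 - \frac{(k_0^2 - k_1^2) \sin k_1 d}{(k_0^2 + k_1^2) \sin k_1 d + 2ik_0 k_1 \cos k_1 d}\right]
= \frac{k_0}{2} \frac{k_1 \sin k_1 d + ik_0 \cos k_1 d}{(k_0^2 + k_1^2) \sin k_1 d + 2ik_0 k_1 \cos k_1 d}
$$

$$
\frac{k_1 (1 + r)}{4k_0} = \frac{k_1}{4k_0}\left[1 + \frac{(k_0^2 - k_1^2) \sin k_1 d}{(k_0^2 + k_1^2) \sin k_1 d + 2ik_0 k_1 \cos k_1 d}\right]
= \frac{k_1}{2} \frac{k_0 \sin k_1 d + ik_1 \cos k_1 d}{(k_0^2 + k_1^2) \sin k_1 d + 2ik_0 k_1 \cos k_1 d}.
$$

Lastly,

$$
\begin{aligned}
\frac{k_0 (1 - r)}{4k_1} + \frac{k_1 (1 + r)}{4k_0} + \frac{1}{2}
&= \frac{1}{2} \frac{2k_0 k_1 \sin k_1 d + i (k_0^2 + k_1^2) \cos k_1 d}{(k_0^2 + k_1^2) \sin k_1 d + 2ik_0 k_1 \cos k_1 d} + \frac{1}{2} \\
&= \frac{i}{2} \frac{(k_0 + k_1)^2 e^{-ik_1 d}}{(k_0^2 + k_1^2) \sin k_1 d + 2ik_0 k_1 \cos k_1 d},
\end{aligned}
$$

where at the last step, I replaced $\sin k_1 d + i \cos k_1 d$ with $i \exp(-ik_1 d)$. Similarly,

$$
\begin{aligned}
\frac{k_0 (1 - r)}{4k_1} + \frac{k_1 (1 + r)}{4k_0} - \frac{1}{2}
&= \frac{1}{2} \frac{2k_0 k_1 \sin k_1 d + i (k_0^2 + k_1^2) \cos k_1 d}{(k_0^2 + k_1^2) \sin k_1 d + 2ik_0 k_1 \cos k_1 d} - \frac{1}{2} \\
&= \frac{i}{2} \frac{(k_0 - k_1)^2 e^{ik_1 d}}{(k_0^2 + k_1^2) \sin k_1 d + 2ik_0 k_1 \cos k_1 d}.
\end{aligned}
$$

Combining all these results, I finally get $A_1^{(R)}$:

$$
A_1^{(R)} = \frac{i}{2} \frac{(k_0 + k_1)^2 - (k_0 - k_1)^2}{(k_0^2 + k_1^2) \sin k_1 d + 2ik_0 k_1 \cos k_1 d}
= \frac{2ik_0 k_1}{(k_0^2 + k_1^2) \sin k_1 d + 2ik_0 k_1 \cos k_1 d}. \tag{12.49}
$$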


Catching my breath after this marathon calculation (OK, half marathon), I am
eager to compare Eq. 12.49 with Eq. 12.45 for the transmission amplitude. With a
sigh of relief, I find that they, indeed, coincide. I will leave it as an exercise to
demonstrate that $B_1^{(R)}$ vanishes as it should.
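If you would rather not run the half marathon by hand, the consistency of Eqs. 12.43 through
12.49 can be checked numerically. The sketch below is my illustration, with assumed units
$\hbar = m_e = 1$, hypothetical helper names, and made-up values of $E$, $V_b$, $V_w$ and $d$.
It multiplies the three matrices of Eq. 12.43, extracts $r$ and $t$, compares them with the
closed-form expressions of Eqs. 12.44 and 12.45, and then propagates the column $(1, r)$
through the whole structure to confirm that the outgoing column is $(t, 0)$.

```python
import cmath
import math

def matmul(a, b):
    return [[sum(a[i][n] * b[n][j] for n in range(2)) for j in range(2)]
            for i in range(2)]

def interface_matrix(k_left, k_right):
    s = (k_right + k_left) / (2 * k_right)
    d = (k_right - k_left) / (2 * k_right)
    return [[s, d], [d, s]]

def propagation_matrix(k, width):
    return [[cmath.exp(1j * k * width), 0.0], [0.0, cmath.exp(-1j * k * width)]]

# Assumed illustrative values with E > V_b > V_w
V_b, V_w, d, E = 1.0, 0.0, 1.3, 3.0
k0 = math.sqrt(2 * (E - V_b))
k1 = math.sqrt(2 * (E - V_w))

# T^(2) = D^(2,1) M^(1) D^(1,0), Eq. 12.43, with k_2 = k_0
T = matmul(interface_matrix(k1, k0),
           matmul(propagation_matrix(k1, d), interface_matrix(k0, k1)))

r_matrix = -T[1][0] / T[1][1]        # Eq. 12.34
t_matrix = 1 / T[1][1]               # Eq. 12.39 with k_N = k_0

sin_kd, cos_kd = math.sin(k1 * d), math.cos(k1 * d)
denominator = (k0**2 + k1**2) * sin_kd + 2j * k0 * k1 * cos_kd
r_formula = (k0**2 - k1**2) * sin_kd / denominator    # Eq. 12.44
t_formula = 2j * k0 * k1 / denominator                # Eq. 12.45

# Amplitudes to the right of the well obtained from the incoming column (1, r)
A_right = T[0][0] + T[0][1] * r_matrix
B_right = T[1][0] + T[1][1] * r_matrix
print(abs(r_matrix - r_formula), abs(t_matrix - t_formula), abs(A_right - t_formula), abs(B_right))
```

All four printed numbers should be at the level of rounding errors.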


12.1.3.2 Bound States 

Now I will illustrate application of the transfer-matrix approach to bound states of
the square potential well described by the same Eq. 12.42. The discrete spectrum of
this potential is expected to exist in the interval of energies defined as $V_w < E < V_b$.
The transfer matrix given in Eq. 12.43 can be adapted to this case by replacing wave
number $k_0$ with $i\kappa_0$, where $\kappa_0$ in this context is defined as

$$
\kappa_0 = \frac{\sqrt{2m_e (V_b - E)}}{\hbar}.
$$

This procedure yields

$$
T^{(2)} = \frac{1}{2\kappa_0 k_1}
\begin{bmatrix}
(-\kappa_0^2 + k_1^2) \sin k_1 d + 2\kappa_0 k_1 \cos k_1 d & (k_1^2 + \kappa_0^2) \sin k_1 d \\
-(k_1^2 + \kappa_0^2) \sin k_1 d & -(-\kappa_0^2 + k_1^2) \sin k_1 d + 2\kappa_0 k_1 \cos k_1 d
\end{bmatrix}
$$

and Eq. 12.41 for the bound state energies takes the following form:

$$
2\kappa_0 k_1 \cos k_1 d = (-\kappa_0^2 + k_1^2) \sin k_1 d
$$

or

$$
\tan(k_1 d) = \frac{2\kappa_0 k_1}{-\kappa_0^2 + k_1^2}. \tag{12.50}
$$

At first glance, this result does not agree with the one I derived in Sect. 6.2.1,
where states were segregated according to their parity with different equations for
the energy levels of the even and odd states. Equation 12.50, on the other hand,
is a single equation, and the parity of the states has never even been mentioned.
If, however, you pause to think about it, you will see that the differences between
results obtained here and in Sect. 6.2.1 are purely superficial.

First of all, you need to notice that the coordinates used here and in Sect. 6.2.1 
have different origins. Placing the origin of the coordinate at the center of the well 
made the inversion symmetry of the potential with respect to its center reflected in its 
coordinate dependence. Consequently, we were able to classify states by their parity. 
This immediate benefit of the symmetry is lost once the origin of the coordinate is 
displaced from the center of the well. This, of course, did not change the underlying 
symmetry of the potential (it has nothing to do with such artificial things as our 
choice of the coordinate system), but it masked it. The wave functions written in 
the coordinate system centered at the edge of the potential well do not have a definite 
parity with respect to point z = 0, and it is not surprising that my derivation of the 
eigenvalue equation naturally yielded a single equation for all energy eigenvalues. 
However, it is not too difficult to demonstrate that our single Eq. 12.50 is in
reality equivalent to the two equations of Sect. 6.2.1, but it does take some extra effort.

First, you should notice that the trigonometric functions in Eqs. 6.39 and 6.42 are
expressed in terms of $kd/2$, while Eq. 12.50 contains $\tan(k_1 d)$. Thus, it makes sense
to try to express $\tan(k_1 d)$ in terms of $k_1 d/2$ using a well-known identity

$$
\tan(k_1 d) = \frac{2 \tan(k_1 d/2)}{1 - \tan^2(k_1 d/2)},
$$

which yields

$$
\frac{\tan(k_1 d/2)}{1 - \tan^2(k_1 d/2)} = \frac{\kappa_0 k_1}{-\kappa_0^2 + k_1^2}.
$$

To simplify the algebra, it is useful to temporarily introduce notations $x = \tan(k_1 d/2)$,
$\sigma = (-\kappa_0^2 + k_1^2)/\kappa_0 k_1$, and rewrite the preceding equation as a quadratic equation
for $x$:

$$
x^2 + \sigma x - 1 = 0.
$$

This equation has two solutions:

$$
x_{1,2} = -\frac{\sigma}{2} \pm \frac{1}{2}\sqrt{\sigma^2 + 4}.
$$

Computing $\sigma^2 + 4$ you will easily find that

$$
\sigma^2 + 4 = \frac{(\kappa_0^2 + k_1^2)^2}{\kappa_0^2 k_1^2},
$$

which yields the following for $x_1$ and $x_2$:

$$
x_1 = -\frac{-\kappa_0^2 + k_1^2}{2\kappa_0 k_1} - \frac{\kappa_0^2 + k_1^2}{2k_1 \kappa_0} = -\frac{k_1}{\kappa_0}
$$

$$
x_2 = -\frac{-\kappa_0^2 + k_1^2}{2\kappa_0 k_1} + \frac{\kappa_0^2 + k_1^2}{2k_1 \kappa_0} = \frac{\kappa_0}{k_1}.
$$

Recalling what $x$ stands for, you can see that the single Eq. 12.50 is now replaced by
two equations:

$$
\tan(k_1 d/2) = -\frac{k_1}{\kappa_0} \tag{12.51}
$$

$$
\tan(k_1 d/2) = \frac{\kappa_0}{k_1}, \tag{12.52}
$$

which are exactly the eigenvalue equations for odd and even wave functions derived
in Sect. 6.2.1. Isn't it beautiful, really?
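The algebra leading from Eq. 12.50 to Eqs. 12.51 and 12.52 can be double-checked with a few
lines of code. In this small sketch the numbers $\kappa_0 = 1.7$ and $k_1 = 0.9$ are arbitrary
values I picked for illustration (in units with $\hbar = m_e = 1$); they are not tied to any
particular well. The script solves the quadratic equation for $x$ and then feeds each root
back into the double-angle identity to recover the right-hand side of Eq. 12.50.

```python
import math

# Arbitrary illustrative values (assumed): kappa_0 = 1.7, k_1 = 0.9
kappa0, k1 = 1.7, 0.9
sigma = (k1**2 - kappa0**2) / (kappa0 * k1)

# Roots of x^2 + sigma*x - 1 = 0, where x stands for tan(k_1 d / 2)
root = math.sqrt(sigma**2 + 4.0)
x1 = (-sigma - root) / 2.0
x2 = (-sigma + root) / 2.0

# Right-hand side of Eq. 12.50 rebuilt from each root via tan(2a) = 2 tan(a) / (1 - tan(a)^2)
rhs_12_50 = 2.0 * kappa0 * k1 / (k1**2 - kappa0**2)
tan_from_x1 = 2.0 * x1 / (1.0 - x1**2)
tan_from_x2 = 2.0 * x2 / (1.0 - x2**2)

print(x1, -k1 / kappa0)      # the pair of Eq. 12.51
print(x2, kappa0 / k1)       # the pair of Eq. 12.52
print(tan_from_x1, tan_from_x2, rhs_12_50)
```

Both roots reproduce the same value of $\tan(k_1 d)$, which is why a single Eq. 12.50
hides two families of solutions.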

Having figured out the situation with the eigenvalues, I can take care of the 
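eigenvectors. The ratio of the wave function amplitudes $A_2/B_0$ is given by

$$\frac{A_2}{B_0} = \frac{(k_1^2 + \kappa_0^2)\sin k_1 d}{2\kappa_0 k_1}, \tag{12.53}$$

while amplitudes of the wave functions in the region $0 < z < d$ are found from

$$\begin{bmatrix} A_1 \\ B_1 \end{bmatrix} = D^{(1,0)} \begin{bmatrix} 0 \\ B_0 \end{bmatrix}.$$

Matrix $D^{(1,0)}$ is adapted to the case under consideration by the same substitution
$k_0 \to i\kappa_0$ as before:

$$D^{(1,0)} = \begin{bmatrix} \dfrac{k_1 + i\kappa_0}{2k_1} & \dfrac{k_1 - i\kappa_0}{2k_1} \\[2mm] \dfrac{k_1 - i\kappa_0}{2k_1} & \dfrac{k_1 + i\kappa_0}{2k_1} \end{bmatrix}.$$

Using this matrix, you easily find

$$\begin{bmatrix} A_1 \\ B_1 \end{bmatrix} = \begin{bmatrix} \dfrac{k_1 - i\kappa_0}{2k_1} \\[2mm] \dfrac{k_1 + i\kappa_0}{2k_1} \end{bmatrix} B_0,$$

which yields the following expression for the wave function inside the well:

$$\psi(z) = B_0\left[\frac{k_1 - i\kappa_0}{2k_1}\exp(ik_1 z) + \frac{k_1 + i\kappa_0}{2k_1}\exp(-ik_1 z)\right] = B_0\left[\cos k_1 z + \frac{\kappa_0}{k_1}\sin k_1 z\right].$$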

You can replace the ratio $\kappa_0/k_1$ in this expression with $\tan(k_1 d/2)$ or with
$-\cot(k_1 d/2)$ according to Eqs. 12.51 and 12.52 and obtain the following expressions
for the wave function representing two different types of states:

$$\psi(z) = B_0 \begin{cases} \dfrac{\cos\left[k_1(z - d/2)\right]}{\cos(k_1 d/2)}, & \kappa_0/k_1 = \tan(k_1 d/2) \\[3mm] -\dfrac{\sin\left[k_1(z - d/2)\right]}{\sin(k_1 d/2)}, & \kappa_0/k_1 = -\cot(k_1 d/2). \end{cases}$$

It is quite obvious now that the found wave functions are even and odd with respect
to variable $\tilde{z} = z - d/2$, which is merely a coordinate defined in the coordinate
system with the origin at the center of the well, just like in Sect. 6.2.1. One can also
show that Eq. 12.53 is reduced to $A_2 = \pm B_0$ for two different types of the wave
function, again in agreement with the results of Sect. 6.2.1. This proof I will leave
to you as an exercise.
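
The parity of the two types of wave functions can also be checked numerically. The sketch below is only an illustration under my own assumptions: the well width and the two values of $k_1$ are arbitrary, and $\kappa_0$ is not computed from a given potential depth but is simply chosen to satisfy $\kappa_0/k_1 = \tan(k_1 d/2)$ or $\kappa_0/k_1 = -\cot(k_1 d/2)$.

```python
import math

def psi_inside(z, k1, kappa0, b0=1.0):
    """Wave function inside the well, with the origin at the left edge (z = 0)."""
    return b0 * (math.cos(k1 * z) + (kappa0 / k1) * math.sin(k1 * z))

d = 2.0  # well width (assumed illustrative value)

# Even-type state: kappa0/k1 = tan(k1*d/2), with 0 < k1*d/2 < pi/2.
k1_even = 1.0
kappa_even = k1_even * math.tan(k1_even * d / 2)

# Odd-type state: kappa0/k1 = -cot(k1*d/2), with pi/2 < k1*d/2 < pi.
k1_odd = 2.5
kappa_odd = -k1_odd / math.tan(k1_odd * d / 2)

offsets = [0.1 * j for j in range(1, 10)]  # distances from the center of the well

# Largest violation of psi(d/2 + s) = psi(d/2 - s) for the even-type state.
even_mismatch = max(
    abs(psi_inside(d / 2 + s, k1_even, kappa_even)
        - psi_inside(d / 2 - s, k1_even, kappa_even))
    for s in offsets
)
# Largest violation of psi(d/2 + s) = -psi(d/2 - s) for the odd-type state.
odd_mismatch = max(
    abs(psi_inside(d / 2 + s, k1_odd, kappa_odd)
        + psi_inside(d / 2 - s, k1_odd, kappa_odd))
    for s in offsets
)
print(even_mismatch, odd_mismatch)
```

Both mismatches come out at the level of rounding errors, which is the numerical counterpart of the statement that the wave functions have definite parity with respect to the center of the well.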
12.2 Resonant Tunneling

In this section I will apply the transfer-matrix method to describe an interesting and
practically important phenomenon of resonant tunneling. This phenomenon arises
when one considers quantum states of a particle in a potential that consists of
two (or more) potential barriers separated by a potential well. An example of such
a potential is shown in Fig. 12.3. I am interested here in the states corresponding to
under-barrier values of energies $E$: $0 < E < V$. It was established in Sect. 6.2.1
that in the case of a single barrier whose width $d$ satisfies inequality $\kappa d \gg 1$,
where $\kappa = \sqrt{2m_e(V - E)}/\hbar$, such states are characterized by an exponentially
small transmission probability $T \propto \exp(-\kappa d)$, which is responsible for the effect
of quantum tunneling: a particle incident on the barrier has a non-zero probability
to "tunnel" through it and continue its free propagation on the other side of the
barrier. You might wonder if adding a second barrier will result in any new and
interesting effects. Common sense based on "classical" probability theory suggests
that in the presence of the second barrier, the total transmission probability will
simply be a product of transmission coefficients for each of the barriers,
$T \propto T_1 T_2 \propto \exp(-\kappa_1 d_1 - \kappa_2 d_2)$, further reducing the probability that the particle tunnels
through the barriers. However, as often happens, the reality is more complex
(and sometimes more intriguing) than our initial intuition suggests. So, let's see if
our intuition leads us astray in this case.

To simplify algebra, I will assume that both barriers have the same width $d$ and
height $V$ and that they are separated by a region of zero potential of length $w$. This
potential profile is characterized by four discontinuity points with coordinates

$$z_0 = 0; \quad z_1 = d; \quad z_2 = d + w; \quad z_3 = 2d + w. \tag{12.54}$$

Accordingly, the propagation of a particle through this potential is described by
four interface matrices, $D^{(1,0)}$, $D^{(2,1)}$, $D^{(3,2)}$, and $D^{(4,3)}$, and three free-propagation
matrices $M^{(1)}$, $M^{(2)}$, and $M^{(3)}$. Matrices $D^{(1,0)}$ and $D^{(2,1)}$ are obviously identical
to matrices $D^{(3,2)}$ and $D^{(4,3)}$, correspondingly, and can be obtained from those
appearing in the first line of Eq. 12.43 by replacing $k_0 \to k = \sqrt{2m_e E}/\hbar$ and
$k_1 \to i\kappa = i\sqrt{2m_e(V - E)}/\hbar$:

$$D^{(1,0)} = D^{(3,2)} = \begin{bmatrix} \dfrac{i\kappa + k}{2i\kappa} & \dfrac{i\kappa - k}{2i\kappa} \\[2mm] \dfrac{i\kappa - k}{2i\kappa} & \dfrac{i\kappa + k}{2i\kappa} \end{bmatrix} \tag{12.55}$$

Fig. 12.3 Double-barrier potential

$$D^{(2,1)} = D^{(4,3)} = \begin{bmatrix} \dfrac{k + i\kappa}{2k} & \dfrac{k - i\kappa}{2k} \\[2mm] \dfrac{k - i\kappa}{2k} & \dfrac{k + i\kappa}{2k} \end{bmatrix} \tag{12.56}$$

For matrices $M^{(1)}$, $M^{(2)}$, and $M^{(3)}$, I can write, using general definition, Eq. 12.26,
and expressions for the corresponding coordinates given in Eq. 12.54:

$$M^{(1)} = M^{(3)} = \begin{bmatrix} \exp(-\kappa d) & 0 \\ 0 & \exp(\kappa d) \end{bmatrix} \tag{12.57}$$

$$M^{(2)} = \begin{bmatrix} \exp(ikw) & 0 \\ 0 & \exp(-ikw) \end{bmatrix} \tag{12.58}$$

The total transfer matrix $T^{(4)}$ then becomes

$$T^{(4)} = D^{(4,3)}M^{(3)}D^{(3,2)}M^{(2)}D^{(2,1)}M^{(1)}D^{(1,0)} = D^{(2,1)}M^{(1)}D^{(1,0)}M^{(2)}D^{(2,1)}M^{(1)}D^{(1,0)} \equiv T^{(2)}M^{(2)}T^{(2)}, \tag{12.59}$$


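where $T^{(2)}$ is the transfer matrix describing the single barrier. I do not have to
calculate this matrix from scratch. Instead, I can again replace $k_0$ with $k$ and $k_1$
with $i\kappa$ in Eq. 12.43:

$$\begin{aligned} T^{(2)} &= \frac{1}{2kk_1}\begin{bmatrix} i(k^2 + k_1^2)\sin k_1 d + 2kk_1\cos k_1 d & i(k_1^2 - k^2)\sin k_1 d \\ -i(k_1^2 - k^2)\sin k_1 d & -i(k^2 + k_1^2)\sin k_1 d + 2kk_1\cos k_1 d \end{bmatrix} \\ &= \frac{1}{2ik\kappa}\begin{bmatrix} i(k^2 - \kappa^2)\sin(i\kappa d) + 2ik\kappa\cos(i\kappa d) & -i(k^2 + \kappa^2)\sin(i\kappa d) \\ i(k^2 + \kappa^2)\sin(i\kappa d) & -i(k^2 - \kappa^2)\sin(i\kappa d) + 2ik\kappa\cos(i\kappa d) \end{bmatrix} \\ &= \frac{1}{2ik\kappa}\begin{bmatrix} -(k^2 - \kappa^2)\sinh(\kappa d) + 2ik\kappa\cosh(\kappa d) & (k^2 + \kappa^2)\sinh(\kappa d) \\ -(k^2 + \kappa^2)\sinh(\kappa d) & (k^2 - \kappa^2)\sinh(\kappa d) + 2ik\kappa\cosh(\kappa d) \end{bmatrix}. \end{aligned} \tag{12.60}$$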


At the last step of this derivation, I used identities connecting trigonometric and
hyperbolic functions: $\sin(iz) = i\sinh z$ and $\cos(iz) = \cosh z$. The elements of this
matrix determine amplitude reflection and transmission coefficients for a single-barrier
potential, $r_1$ and $t_1$ correspondingly, as established by Eqs. 12.34 and 12.39:

$$t_1 = \frac{2ik\kappa}{(k^2 - \kappa^2)\sinh(\kappa d) + 2ik\kappa\cosh(\kappa d)} \tag{12.61}$$

$$r_1 = \frac{(k^2 + \kappa^2)\sinh(\kappa d)}{(k^2 - \kappa^2)\sinh(\kappa d) + 2ik\kappa\cosh(\kappa d)} \tag{12.62}$$

Equations 12.61 and 12.62, obviously, can be derived from Eqs. 12.44 and 12.45 for
the single-well problem with the same replacements of $k_0$ and $k_1$ used to obtain the
T-matrix itself.
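
As a sanity check of Eqs. 12.55 to 12.62, one can multiply the three matrices numerically and compare the result with the closed-form coefficients. The sketch below is a minimal illustration under stated assumptions: I use units in which $\hbar^2/2m_e = 1$ (so that $k = \sqrt{E}$ and $\kappa = \sqrt{V - E}$), the values of $V$, $E$, and $d$ are arbitrary, and all function names are my own.

```python
import math

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]

def barrier_closed_form(k, kappa, d):
    """Amplitude coefficients t1 and r1 from Eqs. 12.61 and 12.62."""
    denom = ((k**2 - kappa**2) * math.sinh(kappa * d)
             + 2j * k * kappa * math.cosh(kappa * d))
    t1 = 2j * k * kappa / denom
    r1 = (k**2 + kappa**2) * math.sinh(kappa * d) / denom
    return t1, r1

def barrier_from_matrices(k, kappa, d):
    """The same coefficients from T2 = D21 * M1 * D10 (Eqs. 12.55-12.57)."""
    ikap = 1j * kappa
    d10 = [[(ikap + k) / (2 * ikap), (ikap - k) / (2 * ikap)],
           [(ikap - k) / (2 * ikap), (ikap + k) / (2 * ikap)]]
    m1 = [[math.exp(-kappa * d), 0.0], [0.0, math.exp(kappa * d)]]
    d21 = [[(k + ikap) / (2 * k), (k - ikap) / (2 * k)],
           [(k - ikap) / (2 * k), (k + ikap) / (2 * k)]]
    t2 = matmul2(d21, matmul2(m1, d10))
    # t1 = 1/T22 and r1 = -T21/T22, as used in the text.
    return 1.0 / t2[1][1], -t2[1][0] / t2[1][1]

# Assumed units: hbar**2/(2*m_e) = 1; V, E, d are arbitrary illustrative values.
V, E, d = 10.0, 4.0, 1.0
k, kappa = math.sqrt(E), math.sqrt(V - E)
t_a, r_a = barrier_closed_form(k, kappa, d)
t_b, r_b = barrier_from_matrices(k, kappa, d)
print(abs(t_a - t_b), abs(r_a - r_b), abs(t_a)**2 + abs(r_a)**2)
```

The first two printed numbers should vanish up to rounding errors, and the third one should be equal to unity, expressing the conservation of probability for a single barrier.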

In order to simplify further computations and also to provide an easier way to
relate the properties of the double-barrier structure to those of its single-barrier
components, I am going to use Eqs. 12.34 and 12.39 to rewrite the transfer matrix
in terms of the amplitude reflection and transmission coefficients, $r_1$ and $t_1$:

$$T^{(2)}_{22} = \frac{1}{t_1}; \qquad T^{(2)}_{21} = -\frac{r_1}{t_1}.$$

Using the explicit form of the matrix $T^{(2)}$, Eq. 12.60, you can determine that
$T^{(2)}_{11} = \left(T^{(2)}_{22}\right)^*$ and $T^{(2)}_{12} = \left(T^{(2)}_{21}\right)^*$, so that the entire $T^{(2)}$ can be written down as

$$T^{(2)} = \begin{bmatrix} 1/t_1^* & -r_1^*/t_1^* \\ -r_1/t_1 & 1/t_1 \end{bmatrix}.$$

Multiplying this by $M^{(2)}$ from Eq. 12.58, I get

$$T^{(2)}M^{(2)} = \begin{bmatrix} \exp(ikw)/t_1^* & -\exp(-ikw)\,r_1^*/t_1^* \\ -\exp(ikw)\,r_1/t_1 & \exp(-ikw)/t_1 \end{bmatrix},$$

and, finally, multiplying this matrix by $T^{(2)}$ (from the right), I find the total
double-barrier T-matrix $T^{(4)}$:


$$T^{(4)} = \begin{bmatrix} \dfrac{\exp(ikw)}{(t_1^*)^2} + \dfrac{\exp(-ikw)\,|r_1|^2}{|t_1|^2} & \dfrac{\exp(ikw)\,r_1^*}{(t_1^*)^2} - \dfrac{\exp(-ikw)\,r_1^*}{|t_1|^2} \\[4mm] -\dfrac{\exp(ikw)\,r_1}{|t_1|^2} - \dfrac{\exp(-ikw)\,r_1}{t_1^2} & -\dfrac{\exp(ikw)\,|r_1|^2}{|t_1|^2} + \dfrac{\exp(-ikw)}{t_1^2} \end{bmatrix}$$


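This expression can be simplified by introducing

$$t_1 = |t_1|\exp(i\varphi_t), \qquad r_1 = |r_1|\exp(i\varphi_r),$$

which yields

$$\begin{aligned} T^{(4)} &= \frac{1}{|t_1|^2}\begin{bmatrix} e^{i(kw + 2\varphi_t)} + |r_1|^2 e^{-ikw} & r_1^*\left(e^{i(kw + 2\varphi_t)} - e^{-ikw}\right) \\ -r_1\left(e^{ikw} + e^{-i(kw + 2\varphi_t)}\right) & e^{-i(kw + 2\varphi_t)} - |r_1|^2 e^{ikw} \end{bmatrix} \\ &= \frac{1}{|t_1|^2}\begin{bmatrix} e^{i\varphi_t}\left[e^{i(kw + \varphi_t)} + |r_1|^2 e^{-i(kw + \varphi_t)}\right] & 2ir_1^* e^{i\varphi_t}\sin(kw + \varphi_t) \\ -2r_1 e^{-i\varphi_t}\cos(kw + \varphi_t) & e^{-i\varphi_t}\left[e^{-i(kw + \varphi_t)} - |r_1|^2 e^{i(kw + \varphi_t)}\right] \end{bmatrix}. \end{aligned} \tag{12.63}$$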


At the last step I factored out $\exp(\pm i\varphi_t)$ to make residual expressions more
symmetrical with respect to the phases of the remaining exponential functions
and used Euler's identities $\cos x = (\exp(ix) + \exp(-ix))/2$ and
$\sin x = (\exp(ix) - \exp(-ix))/2i$. Now you can simply read out the expressions for the
total amplitude transmission and reflection coefficients:

$$t_{db} = \frac{|t_1|^2\exp(i\varphi_t)}{-|r_1|^2\exp(ikw + i\varphi_t) + \exp(-ikw - i\varphi_t)} \tag{12.64}$$

$$r_{db} = \frac{r_1^*\exp(2i\varphi_t)\left[\exp(ikw + i\varphi_t) - \exp(-ikw - i\varphi_t)\right]}{-|r_1|^2\exp(ikw + i\varphi_t) + \exp(-ikw - i\varphi_t)} \tag{12.65}$$

where subindex db stands for the double barrier.

I will begin the analysis of the obtained expression with the transmission
probability $T_{db} = |t_{db}|^2$:

$$T_{db} = \frac{|t_1|^4}{\left|\left(1 - |r_1|^2\right)\cos(kw + \varphi_t) - i\left(1 + |r_1|^2\right)\sin(kw + \varphi_t)\right|^2}.$$

At this point it is useful to recall that transmission and reflection probabilities obey
the probability conservation condition $|t_1|^2 + |r_1|^2 = 1$, which allows one to rewrite the
expression for $T_{db}$ in the simplified form

$$T_{db} = \frac{|t_1|^4}{|t_1|^4\cos^2(kw + \varphi_t) + \left(1 + |r_1|^2\right)^2\sin^2(kw + \varphi_t)}. \tag{12.66}$$

The corresponding expression for the reflection probability becomes

$$R_{db} = \frac{4|r_1|^2\sin^2(kw + \varphi_t)}{|t_1|^4\cos^2(kw + \varphi_t) + \left(1 + |r_1|^2\right)^2\sin^2(kw + \varphi_t)}. \tag{12.67}$$

Before going any further, it is always useful to check that the results obtained obey
the probability conservation condition $R_{db} + T_{db} = 1$. To prove that this is indeed
true, you just need to demonstrate that

$$|t_1|^4 + 4|r_1|^2\sin^2(kw + \varphi_t) = |t_1|^4\cos^2(kw + \varphi_t) + \left(1 + |r_1|^2\right)^2\sin^2(kw + \varphi_t).$$

You might find an easier way to prove this identity, but this is how I did it:

$$\begin{aligned} |t_1|^4 + 4|r_1|^2\sin^2(kw + \varphi_t) &= |t_1|^4\left(\cos^2(kw + \varphi_t) + \sin^2(kw + \varphi_t)\right) + 4|r_1|^2\sin^2(kw + \varphi_t) \\ &= |t_1|^4\cos^2(kw + \varphi_t) + \left(4|r_1|^2 + |t_1|^4\right)\sin^2(kw + \varphi_t) \\ &= |t_1|^4\cos^2(kw + \varphi_t) + \left(4|r_1|^2 + \left(1 - |r_1|^2\right)^2\right)\sin^2(kw + \varphi_t) \\ &= |t_1|^4\cos^2(kw + \varphi_t) + \left(1 + |r_1|^2\right)^2\sin^2(kw + \varphi_t). \end{aligned} \tag{12.68}$$

Having verified that my calculations are not obviously wrong, I can proceed with
their analysis. Recall the naive expectation, which I described in the
beginning of this section: adding a second barrier would result in a total
transmission being just a product of the transmission probabilities through each
barrier, which in our case of identical barriers would mean $T_{db} = |t_1|^4$. Looking
at Eq. 12.66, you can indeed notice the factor $|t_1|^4$ in its numerator, but you will
also see that this factor is accompanied by a denominator, which is responsible
for breaking our naive expectations. What this denominator does is select special
energies, namely, the ones obeying the condition

$$g(E) \equiv k(E)w + \varphi_t(E) = \pi n, \quad n = 1, 2, 3, \ldots, \tag{12.69}$$

which turns $\sin(kw + \varphi_t)$ in Eqs. 12.66 and 12.67 to zero. For energies satisfying
Eq. 12.69, the reflection coefficient vanishes and the transmission coefficient turns
to unity. So much for the second barrier suppressing the transmission probability!
In reality, the presence of the second barrier somehow magically helps the quantum
particle to penetrate both barriers without any reflection, albeit only at special
energies. This phenomenon is called resonant tunneling, and it is a wonderful
manifestation of the importance of quantum superposition of states or, as one could
say, of the wave nature of quantum particles. Energy values at which the resonant
tunneling takes place are called tunneling resonances.
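
To get a feeling for the numbers, Eqs. 12.66 and 12.67 can be evaluated directly. In the sketch below the phase $g = kw + \varphi_t$ is simply treated as an input number rather than computed from the energy, and the value $|t_1|^2 = 0.01$ is an arbitrary example of a thick barrier; both choices, as well as the function name, are my own assumptions made for illustration only.

```python
import math

def double_barrier(t1_sq, g):
    """Return (T_db, R_db) from Eqs. 12.66 and 12.67.

    t1_sq is the single-barrier transmission probability |t1|**2,
    g is the phase k*w + phi_t, taken here as a given number.
    """
    r1_sq = 1.0 - t1_sq  # probability conservation for one barrier
    denom = (t1_sq**2 * math.cos(g)**2
             + (1.0 + r1_sq)**2 * math.sin(g)**2)
    return t1_sq**2 / denom, 4.0 * r1_sq * math.sin(g)**2 / denom

t1_sq = 0.01  # assumed illustrative value for a thick barrier
T_res, R_res = double_barrier(t1_sq, math.pi)        # on resonance, n = 1
T_off, R_off = double_barrier(t1_sq, 1.5 * math.pi)  # halfway between resonances
print("on resonance: ", T_res, R_res)
print("off resonance:", T_off, R_off, " naive product:", t1_sq**2)
```

On resonance the formula returns a transmission equal to unity up to rounding errors, while halfway between two resonances the transmission is even smaller than the naive product $|t_1|^4$.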

To analyze this effect in more detail, it is useful to rearrange terms in the
denominator of Eq. 12.66. Using the identity in Eq. 12.68, I can rewrite the expression
for the transmission probability $T_{db}$ as

$$T_{db} = \frac{|t_1|^4}{|t_1|^4 + 4|r_1|^2\sin^2(kw + \varphi_t)} = \frac{1}{1 + \dfrac{4|r_1|^2}{|t_1|^4}\sin^2(kw + \varphi_t)}. \tag{12.70}$$

This expression makes it even more obvious that every time when the energy of the
particle obeys the resonance condition, Eq. 12.69, the transmission turns to unity,
but it also reveals the role of the parameter

$$\Gamma = \frac{|t_1|^2}{|r_1|}. \tag{12.71}$$


Indeed, let me find the values of the energy for which the transmission drops to
half of its maximum value, i.e., becomes equal to 1/2. Quite obviously, this happens
whenever

$$\frac{4}{\Gamma^2}\sin^2 g = 1 \;\Rightarrow\; |\sin g| = \Gamma/2. \tag{12.72}$$

In the case of the thick individual barriers, when the effect of the resonant
transmission is most drastic, the single-barrier transmission $|t_1|$ is small, while the
reflection $|r_1|$ is almost unity. In this case Eq. 12.71 can be approximated as follows:

$$\Gamma = \frac{|t_1|^2}{\sqrt{1 - |t_1|^2}} \approx \frac{|t_1|^2}{1 - |t_1|^2/2} \approx |t_1|^2\left(1 + |t_1|^2/2\right) \approx |t_1|^2, \tag{12.73}$$

where I neglected terms smaller than $|t_1|^2$. This approximation shows that $\Gamma$ is as
small as $|t_1|^2$, meaning that according to Eq. 12.72, the value of the phase $g(E)$ at the
energy values corresponding to $T_{db} = 1/2$ only weakly deviates from its value at the
resonant energy $E_n$, $g(E_n) = \pi n$. Accordingly, $g(E)$ can be presented as $g(E) = \pi n + \delta g_n$
where $|\delta g_n| \ll 1$, allowing one to simplify Eq. 12.72 as

$$|\sin(\delta g_n)| \approx |\delta g_n| = \Gamma/2. \tag{12.74}$$


Thus, parameter $\Gamma/2$ determines the magnitude of the deviation of the phase $g(E)$
from its resonant value required to bring down the transmission coefficient by half.
The smaller the $\Gamma$, the smaller is such deviation, which means, in other words, that
smaller $\Gamma$ results in a steeper decrease of transmission when the particle's energy shifts
away from the resonance. Deviation of the phase can be translated into the respective
deviation of energy by presenting

$$g(E) \approx g(E_n) + \frac{dg(E)}{dE}\,\delta E.$$

The deviation of the phase equal to $\Gamma/2$ corresponds to the deviation of energy
equal to

$$\frac{\Gamma_E}{2} = \left[\frac{dg(E)}{dE}\right]^{-1}\frac{\Gamma}{2}. \tag{12.75}$$


If one plots transmission as a function of particle energy, the resonant values will
appear as peaks of the transmission, while parameter $\Gamma_E$ will determine the width
of these peaks. More accurately, $\Gamma_E/2$ is called the half-width at half-maximum
(HWHM). The origin of "half-maximum" in this term is obvious, and half-width
refers to the fact that Eq. 12.74 has two solutions, $\delta g_n = \pm\Gamma/2$, and the total width of
the resonance at half-maximum is $(E_n + \Gamma_E/2) - (E_n - \Gamma_E/2) = \Gamma_E$. Widening
of the barriers results in decreasing $\Gamma$, which can be qualitatively described as
narrowing of the resonances. You can observe this phenomenon in Fig. 12.4,
presenting transmission as a function of energy for several barrier widths. You can
also see that the resonances broaden with increasing energy. This is the result of
the energy dependence of the elements of the single-barrier transfer matrix and,
correspondingly, of the parameter $\Gamma$ and the derivative of the phase $dg/dE$. The
explicit expression for this derivative can be found from Eq. 12.61 for the amplitude
transmission coefficient, but the result is rather cumbersome and can be left out.
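
The meaning of the parameter $\Gamma$ as a width in terms of the phase is easy to confirm numerically. The following sketch, in which the value $|t_1|^2 = 0.01$ and the function names are my own illustrative assumptions, evaluates Eq. 12.70 at the phase shifted from the resonance by the exact solution of Eq. 12.72 and by its small-angle approximation, Eq. 12.74.

```python
import math

def gamma_param(t1_sq):
    """Parameter Gamma = |t1|**2/|r1| of Eq. 12.71, with |r1|**2 = 1 - |t1|**2."""
    return t1_sq / math.sqrt(1.0 - t1_sq)

def transmission(t1_sq, g):
    """Double-barrier transmission in the form of Eq. 12.70."""
    gam = gamma_param(t1_sq)
    return 1.0 / (1.0 + (4.0 / gam**2) * math.sin(g)**2)

t1_sq, n = 0.01, 1  # assumed illustrative values
gam = gamma_param(t1_sq)

delta_exact = math.asin(gam / 2.0)  # exact solution of |sin g| = Gamma/2
T_half = transmission(t1_sq, math.pi * n + delta_exact)
T_half_approx = transmission(t1_sq, math.pi * n + gam / 2.0)  # Eq. 12.74

print("Gamma =", gam, " |t1|^2 =", t1_sq)
print("T at exact half-width:      ", T_half)
print("T at approximate half-width:", T_half_approx)
```

For a barrier this thick, $\Gamma$ differs from $|t_1|^2$ by about half a percent, in line with Eq. 12.73, and replacing $\sin(\delta g_n)$ with $\delta g_n$ changes the transmission at the half-width point only in the sixth decimal place.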

Fig. 12.4 also reveals that parameter $\Gamma$ determines how small the transmission
becomes between the maximums and, therefore, how prominent the resonances are.
In order to see where this effect comes from, it is useful to rewrite Eq. 12.70 for
transmission as

$$T_{db} = \frac{(\Gamma/2)^2}{(\Gamma/2)^2 + \sin^2(kw + \varphi_t)}. \tag{12.76}$$

One can see now that the minimum of transmission, which occurs whenever
$|\sin(kw + \varphi_t)|$ reaches its largest value of unity, is

$$T_{db}^{(min)} = \frac{1}{4/\Gamma^2 + 1} \approx \frac{\Gamma^2}{4},$$

where I assumed at the last step that $\Gamma \ll 1$, i.e., it increases with increasing $\Gamma$. You
may also notice that the position of the resonances is different for different barrier
thicknesses. This result seems to be contrary to Eq. 12.69, which shows explicitly
only the dependence of the resonant energies on the distance between the barriers,
$w$. The observed effect of the dependence of the resonances on $d$ emphasizes the
role of the phase factor $\varphi_t$, which does depend on the thickness of the barriers, but
is often overlooked.

In the vicinity of the resonance $g_n = \pi n$, one can expand $\sin(g)$ as

$$\sin(g - g_n + \pi n) = (-1)^n\sin(g - g_n) \approx (-1)^n(g - g_n) \approx (-1)^n(dg/dE)(E - E_n).$$

With this approximation Eq. 12.76 for transmission can be presented in the vicinity
of the resonance as

$$T_{db} = \frac{(\Gamma_E/2)^2}{(\Gamma_E/2)^2 + (E - E_n)^2}. \tag{12.77}$$

Resonance behavior of this type occurs frequently in various areas of physics and is
called a Breit-Wigner resonance, while Eq. 12.77 bears the name of a Breit-Wigner
formula.¹
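
To see how well the Breit-Wigner formula reproduces Eq. 12.76 near a resonance, one can compare the two expressions numerically. The sketch below is a toy illustration: instead of the actual phase $g(E)$ it assumes a strictly linear dependence $g(E) = \pi n + (dg/dE)(E - E_n)$ with a made-up constant slope, and the values of $\Gamma$, $E_n$, and $dg/dE$ are hypothetical numbers of my own choosing.

```python
import math

def t_exact(gam, g):
    """Transmission in the form of Eq. 12.76."""
    return (gam / 2.0)**2 / ((gam / 2.0)**2 + math.sin(g)**2)

def t_breit_wigner(gam_e, e, e_n):
    """Breit-Wigner (Lorentzian) approximation, Eq. 12.77."""
    return (gam_e / 2.0)**2 / ((gam_e / 2.0)**2 + (e - e_n)**2)

# Hypothetical parameters: dimensionless Gamma, resonance energy, phase slope.
gam, n, e_n, dg_de = 0.01, 1, 1.0, 5.0
gam_e = gam / dg_de  # energy width following from Eq. 12.75

# Scan energies within two full widths on each side of the resonance.
energies = [e_n + gam_e * (j / 10.0) for j in range(-20, 21)]
max_diff = max(
    abs(t_exact(gam, math.pi * n + dg_de * (e - e_n))
        - t_breit_wigner(gam_e, e, e_n))
    for e in energies
)
T_at_hwhm = t_breit_wigner(gam_e, e_n + gam_e / 2.0, e_n)
print("largest deviation:", max_diff, " T at half-width:", T_at_hwhm)
```

With these numbers the two curves differ by less than $10^{-4}$ over the whole scanned interval, and the Lorentzian drops to exactly one half at $E = E_n \pm \Gamma_E/2$.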

The treatment of the resonant tunneling, which I have developed, is remarkably
independent of the details of the shapes of the barriers constituting the double-barrier
structure. As long as the boundaries of the barriers are clearly defined so
that I can write down a single-barrier transfer matrix $T^{(2)}$ and the distance between
the barriers, $w$, I can use the results of this section. Do not get me wrong: the
parameters of $T^{(2)}$, of course, depend on the details of the barrier's shape, but what
I want to say is that $T^{(2)}$ can be computed independently of the double-barrier
problem once and for all, numerically if needed, and then used in the analysis of
the resonant tunneling.

¹ Gregory Breit was an American physicist, known for his work in high energy physics and
involvement at the earlier stages of the Manhattan project. Eugene Wigner was a Hungarian-American
theoretical physicist, winner of half of the 1963 Nobel Prize "for his contributions to the
theory of the atomic nucleus and the elementary particles, particularly through the discovery and
application of fundamental symmetry principles." In 1939 he participated in a fateful Einstein-Szilard
meeting resulting in a letter to President Roosevelt prompting him to initiate work
on development of atomic bombs. You might find this comment of his particularly intriguing,
"It was not possible to formulate the laws of quantum mechanics in a fully consistent way
without reference to consciousness," which he made in one of his essays published in the collection
"Symmetries and Reflections - Scientific Essays (1995)."

So, I hope you are convinced by now that the resonant tunneling is a remarkable
phenomenon, which can be relatively simply described in terms of the reflection and
transmission coefficients of a single barrier. Still, you might feel certain dissatisfaction
because all these calculations do not really explain how passing through two
thick barriers instead of one can improve the probability of transmission, let alone
make it equal to one. They also do not clarify the role of the quantum superposition,
which, I claimed, played a crucial role in this phenomenon. There are several distinct
ways to develop a more qualitative, intuitive understanding of this situation. The first is
naturally based on thinking about quantum mechanical properties of the particle in
terms of waves, their superposition and interference. To see how these ideas play out,
consider an expression for the amplitude reflection coefficient $r_{db}$, which determines
the relative contribution of the backward propagating wave in the wave function
representing the state of the particle in the region $z < 0$:

$$\psi(z) = \exp(ikz) + r_{db}\exp(-ikz).$$

A careful look at Eq. 12.65 reveals that this expression describes a superposition
of two waves, both propagating backward, but with different phases. The origin of
these contributions is the multiple reflections of the waves representing the particle's
state between the boundaries of both barriers (this is why the second barrier is
crucial for this effect to occur). The only terms contributing to the phase difference
between them are $\exp(ikw + i\varphi_t)$ and $\exp(-ikw - i\varphi_t + i\pi)$, where the extra $i\pi$
in the argument of the exponent takes care of the negative sign appearing in front
of this expression in Eq. 12.65. The phase difference between these contributions
to the reflected (backward propagating) component of the wave function is
$\Delta\phi = 2kw + 2\varphi_t + \pi$, and if we want to suppress reflection by destructive interference,
we must require that $\Delta\phi = \pi + 2\pi n$, which results in exactly the condition for the
transmission resonance $kw + \varphi_t = \pi n$.

It is also instructive to take a look at the spatial dependence of the probability
density $|\psi(z)|^2$ for resonant and off-resonant values of energy. The analytical
expression for this quantity is quite cumbersome, especially off the resonance, so
I will spare you from having to suffer through its derivation, presenting instead
only the corresponding graphs obtained for the same values of the parameters as
in Fig. 12.4 for off- and on-resonance values of the particle's energy. The first two
graphs in Fig. 12.5 correspond to values of energy smaller and larger than the
energy of the first tunneling resonance. In both cases you can observe oscillations of
the probability in the region $z < 0$ due to interference between incident and reflected
waves. You should also notice that the relative probability to find the particle
in front of the barrier at the maximums of the interference pattern significantly
exceeds the probability to find the particle between the barriers or behind them (the
right boundary of the second barrier can be clearly identified from the graphs by
the absence of any interference pattern in the transmitted wave) for energies both
below and above the resonance. The situation, however, changes completely at the
resonance (the last graph in the figure). The most remarkable feature of this graph
is a pronounced increase of the likelihood that the particle is located between the
barriers. If we are dealing with a beam of many electrons incident on the structure,
this effect will result in an accumulation of electrons between the barriers making
this region strongly negatively charged. The electric field associated with this strong
charge will repel incoming electrons making it more difficult for additional electrons
to penetrate the barriers. This effect, called Coulomb blockade, can be noticed as
an increase in the number of reflected electrons as we increase the density of electrons
in the beam. For very small distance between the barriers, the effect of Coulomb
blockade can be so strong that even a single electron is capable of preventing
other electrons from entering the structure. Thanks to this phenomenon, physicists
and engineers gain the ability to count individual electrons and develop single-electron
devices.

Fig. 12.5 Spatial dependence of the probability density $|\psi(z)|^2$ for energies below, above, and
equal to the energy of the first tunneling resonance (horizontal axis: coordinate). Parameters of
the double-barrier structure are the same as in Fig. 12.4 with the barrier width $d = 0.4$ nm

It is important to notice that the resonance probability distribution featured in 
Fig. 12.5 corresponds to the smallest of the resonance energies, which satisfies 
Eq. 12.69 with n = 1. Now I want you to take a look at the probability distributions 
corresponding to resonance energies satisfying Eq. 12.69 with n = 2 and n = 3 
presented in Fig. 12.6. 

Fig. 12.6 Spatial dependence of the probability density $|\psi(z)|^2$ at the resonance energies of the
second and third order ($n = 2, 3$)

Ignore for a second that the functions depicted in Figs. 12.5 and 12.6 do not
vanish at infinity, and compare them to those shown in Fig. 6.6, which present the
wave functions corresponding to the first three bound energy levels in a rectangular
potential. Taking into consideration the obvious difference stemming from the
fact that the graphs in Fig. 6.6 are those of the real-valued wave functions, while
Figs. 12.5 and 12.6 depict $|\psi(z)|^2$, you cannot help noticing the eerie resemblance
between the two sets of graphs. You might also notice that the resonance condition,
Eq. 12.69, resembles an equation for the energy eigenvalues of the bound states.
Actually, in the limit $d \to \infty$, this equation must exactly reproduce Eq. 12.50
with obvious replacements $d \to w$ and $V_w \to 0$, and it would be interesting to
demonstrate that. Will you dare to try? Do not get deceived by the term $kw$ in
Eq. 12.69, which might make you think about the bound states of an infinite potential
well. The finite nature of the potential barriers arising in the limit $d \to \infty$ is hidden
in the phase term $\varphi_t$, which plays the main role when recasting Eq. 12.69 into
the form of Eq. 12.50.

Anyway, this similarity between resonance wave functions and those of the 
bound states offers an alternative interpretation of the phenomenon of the resonant 
tunneling. Imagine that you start with a potential, in which the barriers are infinitely 
thick, so that you can place a particle in one of the stationary states of the respective 
potential well. Then, using a magic wand, you reduce the thickness of the barriers 
to a finite value. What will happen to the particle in this situation? Using what we 
learned about the tunneling effect, you can intuit that the particle will “tunnel out” 
of the potential well and escape to infinity. In formal mathematical language,
this can be rephrased by saying that the boundary condition for the corresponding
Schrödinger equation at $z \to \pm\infty$ must take a form of a wave propagating to the
right ($\exp(ikz)$) for $z \to \infty$ and a wave propagating to the left ($\exp(-ikz)$) for
$z \to -\infty$. These boundary conditions differ from the ones we used when deriving
the transmission and reflection coefficients by the absence of the wave $\exp(ikz)$
incident on the potential from negative infinity. Correspondingly, the wave function
at $z < 0$ and $z > 2d + w$ is now presented by the column vectors

$$v_0 = \begin{bmatrix} 0 \\ r \end{bmatrix}, \qquad v_4 = \begin{bmatrix} t \\ 0 \end{bmatrix},$$

similar to the bound state problem. Also, similar to the bound state problem, you
will have to conclude that the transfer-matrix equation $T^{(4)}v_0 = v_4$ in this case has
non-zero solutions only if the element in the second row and second column of $T^{(4)}$
vanishes. Equation 12.63 then yields

$$-|r_1|^2\exp(ikw + i\varphi_t) + \exp(-ikw - i\varphi_t) = 0.$$

This equation can be transformed into a form more convenient for further
discussion:

$$\exp(2ikw + 2i\varphi_t) \equiv \exp(2ig) = \frac{1}{|r_1|^2}, \tag{12.78}$$

where I brought back the same notation for the phase $g = kw + \varphi_t$ used when
discussing tunneling resonances. Trying to solve this equation, for instance, by
graphing its left-hand and right-hand sides, you will immediately realize that this
equation does not have real-valued solutions (for real $g$ its left-hand side is a complex
number of unit modulus, while the right-hand side is real and larger than unity
whenever $|r_1| < 1$). More accurately, I should say that this
equation might have real solutions only if $|r_1|$ is equal to unity for all frequencies,
which is equivalent to the requirement that the thickness of the barriers $d$ becomes
infinite. If this is the case, Eq. 12.78 can be satisfied if $2g = 2\pi n$, which is, of
course, just Eq. 12.69, and I hope that by now you have already demonstrated that
this equation is equivalent to Eq. 12.50 in the limit of the infinitely thick barriers.
We, however, are interested in the situation when the barriers are thick but finite,
so that $|r_1|^2$ is less than one, but not by much. Using the probability conservation
equation, $|t_1|^2 + |r_1|^2 = 1$, I can rewrite Eq. 12.78 as

$$\exp(2ig) = \frac{1}{1 - |t_1|^2} \tag{12.79}$$

and using condition $|t_1|^2 \ll 1$, approximate it as

$$\exp(2ig) \approx 1 + |t_1|^2 \tag{12.80}$$

where I used the well-known approximation

$$(1 + x)^{\alpha} \approx 1 + \alpha x,$$

which is just the first two terms in the power series expansion of function $(1 + x)^{\alpha}$
with $\alpha = -1$ and $x = -|t_1|^2$. Expecting that the solution to this equation deviates only slightly
from $g_n = \pi n$, I will present $g$ as $g = \pi n + \zeta$, where $|\zeta| \ll 1$. The exponential
left-hand side of Eq. 12.80 in this case becomes

$$\exp(2i\pi n + 2i\zeta) = e^{2i\zeta} \approx 1 + 2i\zeta,$$

and substituting it into Eq. 12.80, I find that $\zeta$ is a purely imaginary quantity equal to

$$\zeta = -\frac{1}{2}i|t_1|^2.$$

Thus, the wave number satisfying Eq. 12.80 acquires an imaginary part defined by
equation

$$kw + \varphi_t = \pi n - \frac{1}{2}i|t_1|^2 \approx \pi n - \frac{i\Gamma}{2}, \tag{12.81}$$

where I used Eq. 12.73 to replace $|t_1|^2$ with $\Gamma$.
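
The quality of the approximation leading to Eq. 12.81 can be checked against the exact complex root of Eq. 12.79, which is available in closed form: $g = \pi n + \frac{i}{2}\ln\left(1 - |t_1|^2\right)$. The sketch below is a minimal illustration with an arbitrarily chosen $|t_1|^2 = 0.01$ and resonance number $n = 2$; the function names are mine.

```python
import cmath
import math

def quasi_phase_exact(t1_sq, n):
    """Exact root of exp(2ig) = 1/(1 - |t1|**2) on the branch near pi*n."""
    return math.pi * n + 0.5j * math.log(1.0 - t1_sq)

def quasi_phase_approx(t1_sq, n):
    """Approximate root from Eq. 12.81: g = pi*n - (i/2)*|t1|**2."""
    return math.pi * n - 0.5j * t1_sq

t1_sq, n = 0.01, 2  # assumed illustrative values
g_exact = quasi_phase_exact(t1_sq, n)
g_approx = quasi_phase_approx(t1_sq, n)

# Residual of Eq. 12.79 for the exact root; it should vanish up to rounding.
residual = abs(cmath.exp(2j * g_exact) * (1.0 - t1_sq) - 1.0)
print("exact root:  ", g_exact)
print("approx root: ", g_approx)
print("residual:", residual, " difference:", abs(g_exact - g_approx))
```

The imaginary part of the root is negative, as in Eq. 12.81, and the difference between the exact and approximate roots is of the order of $|t_1|^4$, which is precisely the order of the terms neglected in Eq. 12.80.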

Gazing for some time at Eq. 12.81, you will realize that something not quite
kosher is happening here. When starting the calculations, we postulated that the
wave function at infinity is described by propagating waves with real wave numbers.
Well, it turns out that it is not possible to keep this assumption and satisfy all other
boundary conditions. If you now substitute Eq. 12.81 into $\exp(\pm ikz)$, it will turn
into $\exp[\pm i(\pi n - \varphi_t)z/w]\exp(\pm\Gamma z/2w)$, which explodes exponentially for both
$z < 0$ and $z > 2d + w$. Quite obviously this is not an acceptable wave function
as it cannot be normalized either in the regular or in the $\delta$-function sense. So, does it
mean that all our efforts for the last hour, hour and a half (I guess this is how long it
would take you to get through this segment of the book, but, believe me, it took me
much, much longer to write it), were in vain? Well, not quite, of course; why would
I bother you with this if they were? What I want to do now to save face is to take
the phase $g(E)$ in Eq. 12.81 and expand it as a function of energy around the point
$E_n$, where $E_n$ is a resonant energy obeying equation $g(E_n) = \pi n$. It will give me

$$g(E) \approx \pi n + (dg/dE)(E - E_n),$$

which I will substitute into Eq. 12.81:

$$\pi n + (dg/dE)(E - E_n) = \pi n - \frac{1}{2}i\Gamma \;\Rightarrow\; E = E_n - \frac{1}{2}i\Gamma\left[\frac{dg}{dE}\right]^{-1} \equiv E_n - \frac{1}{2}i\Gamma_E, \tag{12.82}$$


where I used Eq. 12.75 to introduce the energy HWHM parameter $\Gamma_E$. Quite clearly,
Eq. 12.81 cannot be satisfied with real values of energy, so that the found solutions
cannot be eigenvalues of a Hermitian operator (which must be real), and of course,
they are not. The problem, which we have been trying to solve, lost its Hermitian
nature once the wave function was allowed not to vanish at infinity. So, if the
found solutions are not "true" energy eigenvalues, what are they? Can we ascribe
to them at least some physical meaning? Well, just by looking at Eq. 12.82, you can
notice that its real part coincides with the energy of the tunneling resonances, while
its imaginary part is equal to the HWHM parameter of those resonances. Plugging
this equation into the time-dependent portion of the wave function (which has been
ignored so far), $\exp(-iEt/\hbar)$, will get you

$$\psi(t) \propto e^{-iE_n t/\hbar}\,e^{-\frac{1}{2}\Gamma_E t/\hbar}. \tag{12.83}$$

The respective probability distribution, which normally wouldn't depend on time, is
now exponentially decreasing:

$$P = |\psi|^2 \propto \exp(-\Gamma_E t/\hbar) \tag{12.84}$$

with a characteristic time scale $\tau_E = \hbar/\Gamma_E$. And Eq. 12.84, actually, admits quite
a natural physical interpretation. To see this, you need to recall the very initial
assumption that I made when I started discussing this approach to the resonant tunneling.
The question I posed at that time was: What would happen to a particle placed in a
stationary state of a potential well with infinitely thick potential barriers if the width
of the barriers would become large but finite? Physical intuition told us that a particle
in this case would be able to tunnel out of the well through the barriers, which means
that the probability to locate the particle inside the well would diminish with time.
We can understand Eq. 12.84 as a formal description of this decay of probability
due to tunneling. Thus, even though the wave functions, which I calculated, do not
appear to have much physical or mathematical meaning, the complex eigenvalues
given by Eq. 12.82 contain important physically relevant information: the real part
yields the energy of the tunneling resonances, while the imaginary part describes
both the resonance width, $\Gamma_E$, and the time of the decay of the probability due to
tunneling, $\tau_E$.

These complex eigenvalues are called quasi-energies , while respective states are 
known as “quasi-stationary states,” “quasi-modes,” or “resonance states.” The term 
quasi-stationary implies that a particle placed in such a state would not stay there 
forever and would tunnel out after some time; the time $\tau_E$ can be understood as an
average lifetime of such states. It is remarkable that the product $\Gamma_E \tau_E$ is simply equal
to the reduced Planck constant $\hbar$, making the relationship between the width of the resonance and
the lifetime of the respective quasi-stationary state similar to the uncertainty relation 
between, say, coordinate and momentum operators. 
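To get a feeling for the numbers, the relation between the resonance width and the lifetime can be turned into a few lines of Python. This is only an illustrative sketch: the function names and the 1 meV width used in the demonstration are my assumptions, not values taken from the text.

```python
import math

# Reduced Planck constant in eV*s (CODATA value, rounded).
HBAR_EV_S = 6.582119569e-16


def lifetime_from_width(gamma_ev: float) -> float:
    """Return tau = hbar / Gamma in seconds for a resonance width Gamma in eV."""
    if gamma_ev <= 0:
        raise ValueError("the resonance width must be positive")
    return HBAR_EV_S / gamma_ev


def survival_probability(t: float, gamma_ev: float) -> float:
    """Return P(t)/P(0) = exp(-Gamma t / hbar) = exp(-t / tau), as in Eq. 12.84."""
    return math.exp(-t / lifetime_from_width(gamma_ev))


if __name__ == "__main__":
    gamma = 1.0e-3  # hypothetical resonance width of 1 meV
    tau = lifetime_from_width(gamma)
    print(f"lifetime tau = {tau:.3e} s")
    print(f"P(tau)/P(0)  = {survival_probability(tau, gamma):.4f}")
```

The narrower the resonance, the longer the particle stays in the well: halving the width doubles the lifetime.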

A more accurate treatment of such time-dependent tunneling requires solving the
time-dependent Schrödinger equation with an appropriate initial condition, but even
this less than rigorous, yet intuitively appealing, approach can be used to infer
important information about the behavior of a particle in potentials similar to the
double-barrier potential considered here. This approach, together with the concept
of quasi-stationary states, was first introduced by George Gamow, an influential
Russian-American physicist, who was born in Odessa (Russian Empire, presently Ukraine),
was educated in the Soviet Union, and defected to the West in 1933 as Stalin’s purges began
to intensify. (One of his closest university friends, Matvei Bronstein, was executed
by Soviet authorities in 1938 on trumped-up treason charges.) Gamow introduced
these states (sometimes called Gamow states) while developing the theory of
alpha-particle radioactivity. His idea was that the alpha-particles contained inside the
nucleus of a radioactive atom experience a potential in the form of a well followed
by a thick, but finite, barrier (see Fig. 12.7).

Fig. 12.7 A schematic of a potential barrier experienced by an alpha-particle inside of a nucleus

Radioactive decay in Gamow’s theory was understood as a slow tunneling of
α-particles out of the nucleus. Gamow’s approach can actually be modified to give
it more mathematical rigor and to turn the wave functions representing the
quasi-states into physically and mathematically meaningful objects. However, the modern
variations of the concept of quasi-states are a topic lying far outside the scope of
this book, so let me just finish this chapter now.


12.3 Problems 

Problem 148 Verify that the matrix equation, Eq. 12.8, is, indeed, equivalent to the 
system of equations, Eqs. 12.6 and 12.7. 

Problem 149 

1. Write down boundary conditions for the wave function and its derivative 
presented in Eq. 12.11. 
Fig. 12.8 Potential described in Problem 154

2. Find relations between amplitudes Aj ’ , j and B i 2 , which would 

convert the boundary conditions found in Part I of the problem into Eqs. 12.13 
and 12.14. 

Problem 150 Demonstrate that coefficients and B ( 0 ^ in Eqs. 12.46 and 12.47 
coincide with coefficients A 2 and B 2 in Eqs. 6.60 and 6.61 in Sect. 6.2. 

Problem 151 Show that coefficient B^ in Eq. 12.48 vanishes. 

Problem 152 Prove that Eq. 12.53 is, indeed, reduced to A 2 = ±Bq for energies 
satisfying dispersion equations 12.51 and 12.52. 

Problem 153 Use the transfer-matrix method to find the equation for the bound
state in the asymmetric potential well:

$$V(z) = \begin{cases} V_1, & z < 0 \\ V_w, & 0 < z < d \\ V_2, & z > d \end{cases}$$

where $V_2 > V_1$.

Problem 154 Consider an electron moving in a potential described as (see
Fig. 12.8)

$$V(z) = \begin{cases} \infty, & z < 0 \\ 0, & 0 < z < w \\ V_b, & w < z < w + d \\ 0, & z > w + d \end{cases}$$

You are interested in the properties of the electron in this potential for energies
$0 < E < V_b$. It is clear that for the particle incident on this potential from the right,
which is the only direction it can be incident from, the probability of reflection is
always equal to one, simply because the wave function at $z < 0$ vanishes and there
can be no transmitted particles. So, it appears that the effects of tunneling resonance
discussed in Sect. 12.2 have no place in this potential. At the same time, in the
limit $d \to \infty$, this potential allows for at least one bound state localized mostly
within the region of the potential well. When the barrier width $d$ becomes finite, this
bound stationary state begins leaking outside due to the tunneling effect, just like we
discussed in the section on resonant tunneling. Accordingly, we must expect this
potential to possess quasi-stationary states, but it is not clear how they are related to
the reflective properties of the potential. I suggest that you try to figure it out.

1. First, find the amplitude reflection coefficient assuming that to the right of the 
potential, there are both incident and reflected waves, so that the column vector 
representing the wave function for z > w + d would look like 



(As always, the first element is occupied by the amplitude in front of the wave 
propagating to the right, but in the case under consideration, this wave represents 
reflected particles.) The wave function for 0 < z < w must turn zero at z = 0, 
which is achieved by the function of the form 

$$f(z) = A e^{ikz} - A e^{-ikz} = A e^{ikw} e^{ik(z-w)} - A e^{-ikw} e^{-ik(z-w)}$$

so that the wave function immediately to the left of the potential discontinuity at $z = w$
is represented by the column

$$\begin{pmatrix} A e^{ikw} \\ -A e^{-ikw} \end{pmatrix}.$$


The wave function for z > w can be found using standard combination of the 
interface and free propagation matrices (you will need two of the former and 
one of the latter). Complete the transfer-matrix calculations and find parameter r. 
Look for any traces of possible resonant behavior paying special attention to the 
phase of r. 

2. Now repeat these calculations assuming that there are no incident particles so 
that the wave function for z > d + w is represented by a column 



Derive equation for the complex quasi-energies of the respective quasi-stationary 
states. Assuming that the imaginary part of the quasi-energies is small, separate 
this equation into an equation for the real and imaginary parts, just like we did in 
Sect. 12.2. Compare the results with those of the preceding calculations. 








Chapter 13 

Perturbation Theory for Stationary 
States: Stark Effect and Polarizability of 
Atoms 




Only a few models in quantum mechanics allow for an exact analytical solution.
Most of the problems that are relevant to real-world situations and are
important for understanding the fundamental nature of things or for applications
can only be solved using one or another type of approximation. In this chapter I
will introduce a method designed for finding approximate solutions for eigenvalues
and corresponding eigenvectors of a time-independent Hamiltonian with a discrete
spectrum. This method works for Hamiltonians that can be written as a sum of
two parts: the main or unperturbed Hamiltonian $\hat H_0$, whose eigenvalues, $E_s^{(0)}$, and
eigenvectors, $|s^{(0)}\rangle$, are presumed to be known, and a perturbation $\epsilon \hat V$. Parameter $\epsilon$
that I pulled out of $\hat V$ has a formal meaning of the strength of the perturbation, but
this can be understood literally only in the sense that the perturbation vanishes when
$\epsilon = 0$. The actual parameter determining the strength of the perturbation emerges
only post factum, after the problem is solved. I will mainly use $\epsilon$ as a technical
bookkeeping device (you will know what it means when you see it) and set it equal
to unity at the end. The index $s$ appearing in the notation for the eigenvalues and
the eigenvectors can be a composite index, consisting of several subindexes. For
instance, if $\hat H_0$ is the Hamiltonian of a hydrogen-like atom, then $s$ contains the principal,
orbital, and magnetic quantum numbers $n$, $l$, $m$. It is also presumed that the perturbation $\hat V$
can be considered small in some yet undefined sense so that the eigenvalues and
eigenvectors of the total Hamiltonian

$$\hat H = \hat H_0 + \epsilon \hat V \tag{13.1}$$

do not deviate too much from $E_s^{(0)}$ and $|s^{(0)}\rangle$, respectively. Consequently,
one might hope that they can be found approximately using the eigenvalues and
eigenvectors of $\hat H_0$ as a starting point.

The development of the method is quite different for non-degenerate unperturbed
eigenvalues and for degenerate ones. You can see where the difference is coming
from by pondering over the following. Whatever approximate expressions I
derive for the eigenvalues and eigenvectors, they must reduce to $E_s^{(0)}$ and $|s^{(0)}\rangle$ as I
set $\epsilon = 0$. If $E_s^{(0)}$ is a non-degenerate eigenvalue so that $|s^{(0)}\rangle$ is the only eigenvector
belonging to it, this process does not raise any issues. If, however, $E_s^{(0)}$ is degenerate,
meaning that there are several unperturbed orthonormal eigenvectors belonging
to it, together with infinitely many linear combinations thereof, the outcome of the
transition $\epsilon \to 0$ becomes a mystery. You will learn eventually that, as
improbable as it might sound, it is the perturbation operator $\hat V$ that “decides” the
outcome of this transition even as its own “strength” goes to zero. You can consider
these somewhat vague remarks as a teaser designed to spur your curiosity. All this
will become (hopefully) much clearer once we get down to it. At this point, my
goal is simply to justify the importance of separate consideration of degenerate and
non-degenerate unperturbed eigenvalues.


13.1 Non-degenerate Perturbation Theory 

The non-degenerate case is more straightforward, so this is what I am going to
begin with. The idea is to present the unknown eigenvalues $E_s$ and eigenvectors $|s\rangle$
as power series of the form

$$E_s = E_s^{(0)} + \epsilon E_s^{(1)} + \epsilon^2 E_s^{(2)} + \epsilon^3 E_s^{(3)} + \cdots \tag{13.2}$$

$$|s\rangle = |s^{(0)}\rangle + \epsilon |s^{(1)}\rangle + \epsilon^2 |s^{(2)}\rangle + \epsilon^3 |s^{(3)}\rangle + \cdots \tag{13.3}$$

and plug them into the stationary Schrödinger equation:

$$\left(\hat H_0 + \epsilon \hat V\right)|s\rangle = E_s |s\rangle. \tag{13.4}$$


This procedure yields

$$\hat H_0 |s^{(0)}\rangle + \epsilon \hat H_0 |s^{(1)}\rangle + \epsilon^2 \hat H_0 |s^{(2)}\rangle + \cdots + \epsilon \hat V |s^{(0)}\rangle + \epsilon^2 \hat V |s^{(1)}\rangle + \epsilon^3 \hat V |s^{(2)}\rangle + \cdots =$$

$$E_s^{(0)} |s^{(0)}\rangle + \epsilon E_s^{(0)} |s^{(1)}\rangle + \epsilon E_s^{(1)} |s^{(0)}\rangle + \epsilon^2 E_s^{(0)} |s^{(2)}\rangle + \epsilon^2 E_s^{(2)} |s^{(0)}\rangle + \epsilon^2 E_s^{(1)} |s^{(1)}\rangle + \cdots. \tag{13.5}$$

For this expression to be true for an arbitrary value of the perturbation parameter $\epsilon$,
it is necessary that the terms with the same power of $\epsilon$ on the left-hand side and on
the right-hand side of this equation be individually equal to each other:

$$\epsilon^0: \quad \hat H_0 |s^{(0)}\rangle = E_s^{(0)} |s^{(0)}\rangle, \tag{13.6}$$

$$\epsilon^1: \quad \hat H_0 |s^{(1)}\rangle + \hat V |s^{(0)}\rangle = E_s^{(0)} |s^{(1)}\rangle + E_s^{(1)} |s^{(0)}\rangle, \tag{13.7}$$

$$\epsilon^2: \quad \hat H_0 |s^{(2)}\rangle + \hat V |s^{(1)}\rangle = E_s^{(0)} |s^{(2)}\rangle + E_s^{(2)} |s^{(0)}\rangle + E_s^{(1)} |s^{(1)}\rangle. \tag{13.8}$$

Now you can see what I meant in saying that $\epsilon$ will only be used for bookkeeping
purposes: I use it to identify different approximation orders, and once this is done, it
can be set to unity.

Equation 13.6 is just the eigenvalue equation for the unperturbed Hamiltonian,
which, as I presumed, is fulfilled by $E_s^{(0)}$ and $|s^{(0)}\rangle$. Corrections to the eigenvalue and
the eigenvector proportional to $\epsilon$ (I will call them the first-order corrections) should
supposedly be found from Eq. 13.7. You might wonder if it is possible to find both
these unknown quantities from a single equation. Well, let’s see.

First, I multiply Eq. 13.7 by $\langle s^{(0)}|$ from the left:

$$\langle s^{(0)}| \hat H_0 |s^{(1)}\rangle + \langle s^{(0)}| \hat V |s^{(0)}\rangle = E_s^{(0)} \langle s^{(0)}|s^{(1)}\rangle + E_s^{(1)} \langle s^{(0)}|s^{(0)}\rangle.$$

Taking into account that $\langle s^{(0)}|s^{(0)}\rangle = 1$ (normalization) and that
$\langle s^{(0)}| \hat H_0 = E_s^{(0)} \langle s^{(0)}|$ (Hermitian property of the Hamiltonian), I transform this expression into

$$E_s^{(0)} \langle s^{(0)}|s^{(1)}\rangle + \langle s^{(0)}| \hat V |s^{(0)}\rangle = E_s^{(0)} \langle s^{(0)}|s^{(1)}\rangle + E_s^{(1)},$$

which gives me the first-order correction to the energy eigenvalue

$$E_s^{(1)} = \langle s^{(0)}| \hat V |s^{(0)}\rangle. \tag{13.9}$$
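A quick numerical check can make Eq. 13.9 more tangible. The sketch below is my own illustration, not part of the derivation: it uses a hypothetical real symmetric two-level system (all numbers are arbitrary assumptions), for which the exact eigenvalues are known in closed form, and compares the exact lower level with the unperturbed energy plus the first-order correction.

```python
import math


def exact_lower_level(e1, e2, v11, v22, v12, eps):
    """Exact lower eigenvalue of H0 + eps*V for a real symmetric two-level system.

    H0 = diag(e1, e2) with e1 < e2; V has diagonal elements v11, v22 and
    off-diagonal element v12. For a weak perturbation the lower root is the
    level that reduces to e1 when eps -> 0.
    """
    a = e1 + eps * v11
    d = e2 + eps * v22
    b = eps * v12
    return 0.5 * (a + d) - math.hypot(0.5 * (a - d), b)


def first_order_level(e1, v11, eps):
    """Unperturbed energy plus the first-order correction of Eq. 13.9."""
    return e1 + eps * v11


def first_order_error(eps, e1=1.0, e2=2.0, v11=0.3, v22=-0.2, v12=0.5):
    """Absolute error of the first-order estimate for the lower level."""
    exact = exact_lower_level(e1, e2, v11, v22, v12, eps)
    return abs(exact - first_order_level(e1, v11, eps))


if __name__ == "__main__":
    for eps in (0.1, 0.05, 0.025):
        print(f"eps = {eps:<6} first-order error = {first_order_error(eps):.3e}")
```

Halving the parameter reduces the error roughly fourfold, which is what one expects if the first neglected term is quadratic in the perturbation.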

So far so good: I got the correction to the energy, but how about the correction to
the eigenvector, $|s^{(1)}\rangle$, which got canceled out? Fret not: the cancelation of
$|s^{(1)}\rangle$ is not a bug but a feature, which allowed me to isolate and determine the
energy correction. In order to obtain $|s^{(1)}\rangle$, I need to do something else, something
which would eliminate the $E_s^{(1)}$ term from Eq. 13.7. One way to achieve this is
to premultiply the equation by an eigenvector of the unperturbed Hamiltonian
different from $\langle s^{(0)}|$, say, $\langle q^{(0)}|$, where $q = 1, 2, 3, \ldots, s-1, s+1, \ldots$. Indeed, in this case
the term with $E_s^{(1)}$ vanishes because of the orthogonality condition: $\langle q^{(0)}|s^{(0)}\rangle = 0$
for $q \neq s$. The remaining expression in this case becomes

$$\langle q^{(0)}| \hat H_0 |s^{(1)}\rangle + \langle q^{(0)}| \hat V |s^{(0)}\rangle = E_s^{(0)} \langle q^{(0)}|s^{(1)}\rangle \;\Rightarrow\;$$

$$E_q^{(0)} \langle q^{(0)}|s^{(1)}\rangle + V_{qs} = E_s^{(0)} \langle q^{(0)}|s^{(1)}\rangle,$$

where I again used $\langle q^{(0)}| \hat H_0 = E_q^{(0)} \langle q^{(0)}|$ and introduced the matrix element
$V_{qs} = \langle q^{(0)}| \hat V |s^{(0)}\rangle$ (note that the order of indexes in $V_{qs}$ follows their order in
$\langle q^{(0)}| \hat V |s^{(0)}\rangle$ from left to right). The result is an equation for $\langle q^{(0)}|s^{(1)}\rangle$, which yields

$$\langle q^{(0)}|s^{(1)}\rangle = \frac{V_{qs}}{E_s^{(0)} - E_q^{(0)}}. \tag{13.10}$$

This quantity is a component of the unknown vector $|s^{(1)}\rangle$ in the direction of the
vector $|q^{(0)}\rangle$ and can be used to find the entire vector $|s^{(1)}\rangle$ as follows. Since $|q^{(0)}\rangle$
are eigenvectors of a Hermitian operator and, therefore, form a basis, I can expand
$|s^{(1)}\rangle$ in this basis as

$$|s^{(1)}\rangle = \sum_q |q^{(0)}\rangle \langle q^{(0)}|s^{(1)}\rangle =$$

$$\sum_{q \neq s} |q^{(0)}\rangle \langle q^{(0)}|s^{(1)}\rangle + \langle s^{(0)}|s^{(1)}\rangle |s^{(0)}\rangle,$$

where in the sum in the second line I separated out the term with $q = s$. With
$\langle q^{(0)}|s^{(1)}\rangle$ found, I am just one step away from finding the entire vector $|s^{(1)}\rangle$: all I
need is the value of $\langle s^{(0)}|s^{(1)}\rangle$, which so far remains unknown. The help comes from
a familiar place: the normalization condition. Consider the found eigenvector with
accuracy up to the first order in $\epsilon$:

$$|s\rangle = |s^{(0)}\rangle + \epsilon \langle s^{(0)}|s^{(1)}\rangle |s^{(0)}\rangle + \epsilon \sum_{q \neq s} \frac{V_{qs}}{E_s^{(0)} - E_q^{(0)}} |q^{(0)}\rangle,$$

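and compute its norm $\langle s|s\rangle$:

$$\langle s|s\rangle = \langle s^{(0)}|s^{(0)}\rangle + \epsilon \left( \langle s^{(0)}|s^{(1)}\rangle + \langle s^{(0)}|s^{(1)}\rangle^{*} \right) \langle s^{(0)}|s^{(0)}\rangle +$$

$$\epsilon \sum_{q \neq s} \frac{V_{qs}}{E_s^{(0)} - E_q^{(0)}} \langle s^{(0)}|q^{(0)}\rangle + \epsilon \sum_{q \neq s} \frac{V_{qs}^{*}}{E_s^{(0)} - E_q^{(0)}} \langle q^{(0)}|s^{(0)}\rangle + O(\epsilon^2).$$

I cannot include in this expression terms of the second order in $\epsilon$ or higher
because other terms of the same order were omitted from the initial expression
for $|s\rangle$. The second line in this expression is actually equal to zero because of the
orthogonality condition $\langle s^{(0)}|q^{(0)}\rangle = \langle q^{(0)}|s^{(0)}\rangle = 0$, so all that is left is

$$\langle s|s\rangle = 1 + 2\epsilon\, \mathrm{Re}\, \langle s^{(0)}|s^{(1)}\rangle,$$

where I took into account that $\langle s^{(0)}|s^{(0)}\rangle = 1$. Thus, if I want the norm of $|s\rangle$ to be
equal to unity, the real part of $\langle s^{(0)}|s^{(1)}\rangle$ must vanish; its imaginary part only changes
the overall phase of $|s\rangle$ in this order and can be dropped, so I set $\langle s^{(0)}|s^{(1)}\rangle = 0$. This is
the last piece I needed to find the first-order correction to the eigenvector, which I can
now write down as

$$|s\rangle = |s^{(0)}\rangle + \epsilon \sum_{q \neq s} \frac{V_{qs}}{E_s^{(0)} - E_q^{(0)}} |q^{(0)}\rangle. \tag{13.11}$$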
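The claim that the first-order eigenvector is normalized up to terms quadratic in the perturbation parameter is easy to test numerically. The following sketch is an illustration under my own assumptions: a hypothetical three-level system with arbitrarily chosen energies and a Hermitian perturbation matrix. It builds the components of the perturbed state in the unperturbed basis according to Eq. 13.11 and evaluates the norm.

```python
def first_order_state(s, energies, v, eps):
    """Components of |s> in the unperturbed basis according to Eq. 13.11.

    energies[q] is the unperturbed eigenvalue of state q, and v[q][s] is the
    matrix element V_qs. The component along the state s itself is exactly 1.
    """
    components = []
    for q, e_q in enumerate(energies):
        if q == s:
            components.append(1.0 + 0.0j)
        else:
            components.append(eps * v[q][s] / (energies[s] - e_q))
    return components


def norm_squared(components):
    """Return <s|s> as the sum of squared moduli of the components."""
    return sum(abs(c) ** 2 for c in components)


# Hypothetical three-level example (all numbers are arbitrary assumptions).
ENERGIES = [1.0, 2.0, 4.0]
V = [
    [0.1, 0.5 - 0.2j, 0.3],
    [0.5 + 0.2j, -0.4, 0.2j],
    [0.3, -0.2j, 0.7],
]

if __name__ == "__main__":
    for eps in (0.1, 0.05):
        deviation = norm_squared(first_order_state(0, ENERGIES, V, eps)) - 1.0
        print(f"eps = {eps:<5} <s|s> - 1 = {deviation:.3e}")
```

For these numbers the deviation of the norm from unity equals 0.3 times the square of the parameter, so it drops by a factor of four when the parameter is halved, in agreement with the order counting used in the text.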
So, I am done with the first-order corrections to the eigenvalues and eigenvectors,
and now I can turn to finding the corrections proportional to $\epsilon^2$, which are often
called second-order corrections. Going back to Eq. 13.8 and performing the same
magic trick of premultiplying it by $\langle s^{(0)}|$, I find

$$\langle s^{(0)}| \hat H_0 |s^{(2)}\rangle + \langle s^{(0)}| \hat V |s^{(1)}\rangle =$$

$$E_s^{(0)} \langle s^{(0)}|s^{(2)}\rangle + E_s^{(2)} \langle s^{(0)}|s^{(0)}\rangle + E_s^{(1)} \langle s^{(0)}|s^{(1)}\rangle.$$

The first term on the left evaluates to $E_s^{(0)} \langle s^{(0)}|s^{(2)}\rangle$ and cancels the first term on
the right, while the remaining expression, remembering that $\langle s^{(0)}|s^{(0)}\rangle = 1$ and
$\langle s^{(0)}|s^{(1)}\rangle = 0$, becomes

$$E_s^{(2)} = \langle s^{(0)}| \hat V |s^{(1)}\rangle = \sum_{q \neq s} \frac{V_{sq} V_{qs}}{E_s^{(0)} - E_q^{(0)}} = \sum_{q \neq s} \frac{|V_{sq}|^2}{E_s^{(0)} - E_q^{(0)}}, \tag{13.12}$$

where I again introduced a matrix element $V_{sq} = \langle s^{(0)}| \hat V |q^{(0)}\rangle$ and took into account
that $V_{qs} = V_{sq}^{*}$.
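To see how much the second-order term improves the estimate, one can evaluate the sum in Eq. 13.12 directly and compare it with an exactly solvable case. The sketch below is only an illustration: the two-level system and its numerical values are hypothetical choices of mine, while the function `second_order_energy` works for any finite list of non-degenerate levels.

```python
import math


def second_order_energy(s, energies, v):
    """Second-order correction of Eq. 13.12: sum over q != s of |V_sq|^2 / (E_s - E_q)."""
    return sum(
        abs(v[s][q]) ** 2 / (energies[s] - energies[q])
        for q in range(len(energies))
        if q != s
    )


def exact_lower_level(energies, v, eps):
    """Exact lower eigenvalue of a real symmetric two-level Hamiltonian H0 + eps*V."""
    a = energies[0] + eps * v[0][0]
    d = energies[1] + eps * v[1][1]
    b = eps * v[0][1]
    return 0.5 * (a + d) - math.hypot(0.5 * (a - d), b)


def approximations(eps, energies, v):
    """Return the first- and second-order estimates for the level s = 0."""
    first = energies[0] + eps * v[0][0]
    second = first + eps ** 2 * second_order_energy(0, energies, v)
    return first, second


# Hypothetical two-level example (arbitrary numbers).
LEVELS = [1.0, 2.0]
PERTURBATION = [[0.3, 0.5], [0.5, -0.2]]

if __name__ == "__main__":
    eps = 0.1
    exact = exact_lower_level(LEVELS, PERTURBATION, eps)
    first, second = approximations(eps, LEVELS, PERTURBATION)
    print(f"exact        = {exact:.6f}")
    print(f"first order  = {first:.6f}  (error {abs(first - exact):.2e})")
    print(f"second order = {second:.6f}  (error {abs(second - exact):.2e})")
```

Note that for the lowest level every energy denominator is negative, so the second-order correction always pushes the ground state down.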

Premultiplying Eq. 13.8 by $\langle q^{(0)}|$, I obtain

$$\langle q^{(0)}| \hat H_0 |s^{(2)}\rangle + \langle q^{(0)}| \hat V |s^{(1)}\rangle =$$

$$E_s^{(0)} \langle q^{(0)}|s^{(2)}\rangle + E_s^{(2)} \langle q^{(0)}|s^{(0)}\rangle + E_s^{(1)} \langle q^{(0)}|s^{(1)}\rangle.$$

Using again the Hermitian property of the Hamiltonian to compute the first term in
the first line and orthogonality of the zero-order eigenvectors to eliminate the middle
term in the second line, I turn this expression into

$$E_q^{(0)} \langle q^{(0)}|s^{(2)}\rangle + \langle q^{(0)}| \hat V |s^{(1)}\rangle = E_s^{(0)} \langle q^{(0)}|s^{(2)}\rangle + E_s^{(1)} \langle q^{(0)}|s^{(1)}\rangle.$$

Now, using Eq. 13.9 as well as Eqs. 13.10 and 13.11, I can convert it into the
following:

$$\left( E_q^{(0)} - E_s^{(0)} \right) \langle q^{(0)}|s^{(2)}\rangle = \langle s^{(0)}| \hat V |s^{(0)}\rangle \frac{V_{qs}}{E_s^{(0)} - E_q^{(0)}} - \sum_{p \neq s} \frac{V_{qp} V_{ps}}{E_s^{(0)} - E_p^{(0)}}.$$

Respectively, the second-order correction to the eigenvector becomes

$$|s^{(2)}\rangle = \sum_{q \neq s} \left[ -\frac{V_{ss} V_{qs}}{\left( E_s^{(0)} - E_q^{(0)} \right)^2} + \sum_{p \neq s} \frac{V_{qp} V_{ps}}{\left( E_s^{(0)} - E_q^{(0)} \right) \left( E_s^{(0)} - E_p^{(0)} \right)} \right] |q^{(0)}\rangle, \tag{13.13}$$

where I again set $\langle s^{(0)}|s^{(2)}\rangle$ to zero as a normalization condition. Equations 13.12
and 13.13 complete my program of derivation of the lowest-order corrections to
the non-degenerate eigenvalues and the eigenvectors of Hamiltonian $\hat H$. Finding the
third- and higher-order corrections becomes progressively more cumbersome and is
rarely necessary.

Ideally, from the point of view of minimizing one’s efforts, it would be preferable
to get all the answers just from the first-order terms. Often, however, as we already
discussed in Chap. 10 on the two-level model and in Sect. 7.1.2 on the quantum harmonic
oscillator, the diagonal elements of the perturbation part of the Hamiltonian, those
that determine the first-order corrections to the eigenvalues, vanish. This happens,
for instance, when the unperturbed Hamiltonian $\hat H_0$ is invariant with respect to the
parity operator, so that its eigenvectors can be classified as being even or odd. If,
in addition, the perturbation operator $\hat V$ is odd (changes sign upon application of
the inversion operator), then $\langle s^{(0)}| \hat V |s^{(0)}\rangle$ vanishes and takes along the first-order
correction to the energy. If the first-order correction to the energy turns out to be zero, you
have no choice but to rely on the second-order correction. If the second-order
terms are not sufficient either, it usually means (there are exceptions, of course)
that the perturbation approach is not suitable for the problem at hand.
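A harmonic oscillator perturbed by a term linear in the coordinate gives a clean illustration of this situation. The example is my own choice and is not worked out in the text: I assume the perturbation $V = -Fx$ and units in which $\hbar = m = \omega = 1$. The perturbation is odd, so all its diagonal matrix elements vanish, while the second-order sum can be compared with the exact shift $-F^2/2$ obtained by completing the square in the potential energy.

```python
import math


def x_element(q, s):
    """Matrix element <q|x|s> of the oscillator coordinate (hbar = m = omega = 1)."""
    if abs(q - s) != 1:
        return 0.0  # x connects only neighboring levels; diagonal elements vanish
    return math.sqrt(max(q, s) / 2.0)


def level(n):
    """Unperturbed oscillator energy n + 1/2."""
    return n + 0.5


def first_order(s, force):
    """First-order correction <s|V|s> for the assumed perturbation V = -force * x."""
    return -force * x_element(s, s)


def second_order(s, force, n_max=50):
    """Second-order correction: sum over q != s of |V_qs|^2 / (E_s - E_q)."""
    total = 0.0
    for q in range(n_max):
        if q == s:
            continue
        v_qs = -force * x_element(q, s)
        total += v_qs ** 2 / (level(s) - level(q))
    return total


if __name__ == "__main__":
    force = 0.3  # hypothetical strength of the linear perturbation
    for s in (0, 1, 3):
        print(f"s = {s}: first order = {first_order(s, force):+.4f}, "
              f"second order = {second_order(s, force):+.4f}, "
              f"exact shift = {-force ** 2 / 2:+.4f}")
```

In this particular model the second-order result happens to coincide with the exact shift for every level, which is a special property of the linearly perturbed oscillator rather than a general feature of the method.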


13.1.1 Quadratic Stark Effect 

The perturbation theory plays a crucial role in understanding the responses of a 
quantum system to external influences such as electric or magnetic fields. Leaving 
the effects due to a magnetic field for a separate chapter, in this section I will focus 
on the interaction between an atom and an electric field, $\mathcal{E}$, which, in many instances,
can be assumed to be spatially uniform. I will also assume that this electric field is 
static, i.e., does not depend on time. Then, what I need to do as the first matter 
of business is to find the electric field-induced corrections to the energy spectrum 
and corresponding eigenvectors of the unperturbed Hamiltonian. When this is done, 
I can move on to analyze how these changes manifest themselves in observable 
phenomena. 

As a practical example, I will consider the effects of the static electric field on
the ground state of a hydrogen atom (Chap. 8), which is the only non-degenerate
energy level of hydrogen and can, therefore, be studied using the developed method.
So, assuming that $\hat H_0$ in Eq. 13.1 describes a hydrogen-like system considered
in Chap. 8, I can replace the abstract zero-order eigenvectors $|s^{(0)}\rangle$ of the previous
section with their more concrete version $|nlm\rangle$ and the energy eigenvalues with
$E_n^{(0)} = -E_g/n^2$. The ground state is described by eigenvector $|100\rangle$ and energy $-E_g$, the
expression for which can be found in Chap. 8. Indexes $s$ and $q$ are now replaced by
three indexes, $n$, $l$, and $m$, and summation over $q$ involves the summation over all
three indexes subject to regular restrictions on their values determined in Chap. 8.
The operator of perturbation $\hat V$ for a uniform electric field has a simple form, which
you already encountered in this book several times, for instance, in Chap. 10:

$$\hat V = -e \mathcal{E} \hat z. \tag{13.14}$$

Here I chose the Z-axis of the coordinate system used to define the components of
the operators of the angular momentum along the direction of the electric field. The
first-order correction to the ground state energy vanishes, as was explained above, because
the perturbation potential is odd with respect to inversion. Those who are skeptical
about symmetry-based arguments can verify this statement directly working, for
instance, in the position representation:

$$\langle 100| \hat V |100\rangle = -e \mathcal{E}\, \frac{1}{4\pi} \int_0^{\pi} \int_0^{2\pi} \int_0^{\infty} d\theta\, d\varphi\, dr\, \sin\theta\, r^2 \left[ R_{10}(r) \right]^2 r \cos\theta,$$

where $R_{10}(r)$ is the radial component of the wave function representing the ground state
of the hydrogen-like system, and I converted $z$ into spherical coordinates. Now
you only need to compute the integral $\int_0^{\pi} d\theta \sin\theta \cos\theta$ over the polar angle $\theta$ to
convince yourself (which shouldn’t be too difficult) that it vanishes.

With this issue clarified, let’s take on the more difficult problem of finding the
second-order correction to the ground state energy using Eq. 13.12. The most
important thing to realize when using this equation is that the sum over $q$ is now
three sums: over $n$, $l$, and $m$. The sum over $n$ starts with $n = 2$, because the term
$n = 1$ is excluded from the summation by the condition $q \neq s$, which in our case
translates to $n \neq 1$; the sum over $l$ runs from 0 to $n - 1$; and the sum over $m$ covers
values from $-l$ to $l$:

$$E_1^{(2)} = \sum_{n=2}^{\infty} \sum_{l=0}^{n-1} \sum_{m=-l}^{l} \frac{|V_{100,nlm}|^2}{E_1^{(0)} - E_n^{(0)}} = e^2 \mathcal{E}^2 \sum_{n=2}^{\infty} \sum_{l=0}^{n-1} \sum_{m=-l}^{l} \frac{|z_{100,nlm}|^2}{E_1^{(0)} - E_n^{(0)}}. \tag{13.15}$$


It should be noted that Eq. 13.15 does not tell the entire story since the spectrum 
of a hydrogen atom contains also a continuous segment, which, strictly speaking, 
needs to be included. Moreover, since the potential due to a constant external 
field grows progressively more negative with the growing value of coordinate z, 
at some point the total potential felt by the electron will become more negative
than a given negative energy of a bound state, making it possible for the
electron to tunnel out of the nucleus’s potential (at which point the atom becomes
ionized). This turns the stationary state of the electron into quasi-stationary, similar 
to the situation considered in Sect. 12.2. All these complications, however, can be 
ignored for a weak enough field because (a) for states from continuous spectrum, 
the energy denominators in Eq. 13.12 become so large that the contribution from the 
corresponding terms can be safely neglected, and (b) even though the bound states 
become formally quasi-stationary, if the field is not too strong, their lifetime is long 
enough to treat them as normal stationary states. 

Now my task is to compute the matrix elements $z_{100,nlm}$, which, using the
hydrogen wave functions from Chap. 8, can be written down as

$$z_{100,nlm} = \frac{1}{\sqrt{4\pi}} \int_0^{\pi} \int_0^{2\pi} \int_0^{\infty} d\theta\, d\varphi\, dr\, \sin\theta\, r^3 \cos\theta\, Y_l^m(\theta, \varphi) R_{n,l}(r) R_{1,0}(r). \tag{13.16}$$

To evaluate the angular portion of the integral,

$$\int_0^{\pi} \int_0^{2\pi} d\theta\, d\varphi\, \sin\theta \cos\theta\, Y_l^m(\theta, \varphi),$$

let me first notice that this integral vanishes for all values of the magnetic number
$m$, with the exception of $m = 0$. To see this, just recall that the spherical harmonics
$Y_l^m(\theta, \varphi)$ contain the factor $\exp(im\varphi)$, which in this integral is the only factor
containing the azimuthal angle $\varphi$. Integration of $\exp(im\varphi)$ over the entire range
of $\varphi$ between 0 and $2\pi$ yields zero unless $m = 0$, when the value of the integral
becomes $2\pi$. Having disposed of the integration with respect to $\varphi$, I am left with the
integral over the polar angle:

$$\int_0^{\pi} \int_0^{2\pi} d\theta\, d\varphi\, \sin\theta \cos\theta\, Y_l^m(\theta, \varphi) = 2\pi \sqrt{\frac{2l+1}{4\pi}}\, \delta_{m,0} \int_0^{\pi} d\theta\, \sin\theta \cos\theta\, P_l(\cos\theta) = \sqrt{\pi (2l+1)}\, \delta_{m,0} \int_{-1}^{1} x P_l(x)\, dx,$$

where I replaced the spherical harmonic $Y_l^0$ with the regular Legendre polynomials
and made the substitution of variables $x = \cos\theta$. To evaluate the remaining integral,
I recall that $x = P_1(x)$ so that the last expression can be rewritten as

$$\sqrt{\pi (2l+1)}\, \delta_{m,0} \int_{-1}^{1} P_1(x) P_l(x)\, dx.$$
All that is left now is to invoke the orthogonality of the Legendre polynomials and use
Eq. 5.71 from Sect. 5.1.4 adapted to the case $m = 0$:

$$\int_{-1}^{1} P_{l_1}(x) P_l(x)\, dx = \frac{2}{2l+1}\, \delta_{l,l_1}.$$

Applying this formula to the case under consideration ($l_1 = 1$), I obtain the final
result for the angular part of the matrix element:

$$\int_0^{\pi} \int_0^{2\pi} d\theta\, d\varphi\, \sin\theta \cos\theta\, Y_l^m(\theta, \varphi) = \sqrt{3\pi}\, \frac{2}{3}\, \delta_{m,0} \delta_{l,1} = \frac{2\sqrt{\pi}}{\sqrt{3}}\, \delta_{m,0} \delta_{l,1}.$$

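Substituting this result into Eq. 13.16, I get

$$z_{100,nlm} = \frac{1}{\sqrt{3}}\, \delta_{m,0} \delta_{l,1} \int_0^{\infty} dr\, r^3 R_{n,1}(r) R_{1,0}(r).$$

Replacing the integration variable $r$ with its dimensionless counterpart $x = Zr/a_B$,
where all notations are taken from Chap. 8, the expression for the matrix element can
be recast as

$$z_{100,nlm} = \frac{1}{\sqrt{3}}\, \delta_{m,0} \delta_{l,1} \left( \frac{a_B}{Z} \right)^4 \int_0^{\infty} dx\, x^3 R_{n,1}(x) R_{1,0}(x).$$

The radial wave functions can be read off Eq. 8.21 for the hydrogen wave functions.
In terms of the dimensionless variable $x$, the ground state ($n = 1$, $l = 0$) function takes
the form

$$R_{1,0}(x) = 2 \left( \frac{Z}{a_B} \right)^{3/2} \exp(-x)$$

(recall that the Laguerre polynomial $L_0^1(2x) = 1$), while $R_{n,1}(x)$ becomes

$$R_{n,1}(x) = \left( \frac{2Z}{n a_B} \right)^{3/2} \sqrt{\frac{(n-2)!}{2n (n+1)!}}\; \frac{2x}{n}\, e^{-x/n} L_{n-2}^{3}\!\left( \frac{2x}{n} \right).$$

Substituting these expressions into the formula for the matrix element, I end up with

$$z_{100,nlm} = \frac{1}{\sqrt{3}}\, \delta_{m,0} \delta_{l,1}\, \frac{a_B}{Z}\; 2 \left( \frac{2}{n} \right)^{3/2} \frac{2}{n} \sqrt{\frac{(n-2)!}{2n (n+1)!}} \int_0^{\infty} dx\, x^3\, x\, e^{-x} e^{-x/n} L_{n-2}^{3}\!\left( \frac{2x}{n} \right) = \frac{a_B}{Z}\, \delta_{m,0} \delta_{l,1} f(n), \tag{13.17}$$

where I introduced function $f(n)$, which depends only on the principal quantum
number $n$:

$$f(n) = \frac{8}{\sqrt{3}\, n^3} \sqrt{\frac{(n-2)!}{(n+1)!}} \int_0^{\infty} dx\, x^4 e^{-x(n+1)/n} L_{n-2}^{3}\!\left( \frac{2x}{n} \right). \tag{13.18}$$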

Now the second-order correction to the ground state energy can be written down as

$$E_1^{(2)} = -\frac{8\pi \varepsilon_r \varepsilon_0\, \mathcal{E}^2 a_B^3}{Z^4} \sum_{n=2}^{\infty} \frac{n^2}{n^2 - 1} f^2(n), \tag{13.19}$$

where I replaced $E_n^{(0)}$ with their actual values $-E_g/n^2$ and used the explicit expression
for $E_g$ in terms of the Bohr radius using Eq. 8.17. In principle, function $f(n)$ can be
found analytically for an arbitrary $n$, but the result is not worth the effort since we
will end up with a nasty looking sum over $n$, which at any rate we wouldn’t be
able to find exactly. Thus, instead, I will evaluate $f(n)$ only for $n = 2, 3, 4, 5$ and
use the results to compute $E_1^{(2)}$ approximately, including only these terms into the
sum. To help you digest and reproduce these computations, I am providing some
intermediate results in Table 13.1.


Table 13.1 Data for calculating the quadratic Stark effect

n    $L_{n-2}^{3}(2x)$                        $f(n)$
2    $1$                                      0.7449
3    $4 - 2x$                                 0.2983
4    $2(5 - 5x + x^2)$                        0.1759
5    $20 - 30x + 12x^2 - \frac{4}{3}x^3$      0.1205
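The entries of Table 13.1 can be reproduced with a short script. The sketch below is my own illustration and rests on two assumptions that should be checked against Chap. 8: the associated Laguerre polynomials are taken in the convention $L_m^{\alpha}(y) = \sum_{i=0}^{m} (-1)^i \binom{m+\alpha}{m-i} y^i / i!$, and $f(n)$ is taken in the form $\frac{8}{\sqrt{3}\,n^3}\sqrt{\frac{(n-2)!}{(n+1)!}} \int_0^\infty x^4 e^{-x(n+1)/n} L_{n-2}^{3}(2x/n)\,dx$. Each term of the integral is evaluated exactly with the elementary formula $\int_0^\infty x^k e^{-bx} dx = k!/b^{k+1}$.

```python
import math
from fractions import Fraction


def radial_integral(n):
    """Exact value of the integral of x^4 exp(-(n+1)x/n) L^3_{n-2}(2x/n) over x from 0 to infinity."""
    b = Fraction(n + 1, n)                       # combined decay rate of exp(-x) exp(-x/n)
    total = Fraction(0)
    for i in range(n - 1):                       # powers i = 0 .. n-2 of the Laguerre polynomial
        c_i = Fraction((-1) ** i * math.comb(n + 1, n - 2 - i), math.factorial(i))
        total += c_i * Fraction(2, n) ** i * math.factorial(4 + i) / b ** (5 + i)
    return total


def f(n):
    """Dimensionless dipole matrix element f(n) for n >= 2."""
    # (n-2)!/(n+1)! simplifies to 1/((n-1) n (n+1)).
    prefactor = 8.0 / (math.sqrt(3.0) * n ** 3) / math.sqrt((n - 1) * n * (n + 1))
    return prefactor * float(radial_integral(n))


def stark_sum(n_max):
    """Partial sum over n = 2..n_max of n^2 f(n)^2 / (n^2 - 1)."""
    return sum(n * n / (n * n - 1) * f(n) ** 2 for n in range(2, n_max + 1))


if __name__ == "__main__":
    for n in range(2, 6):
        print(f"n = {n}: f(n) = {f(n):.4f}, term = {n * n / (n * n - 1) * f(n) ** 2:.4f}")
    print(f"sum of the first four terms = {stark_sum(5):.4f}")
```

Working with exact fractions inside the integral avoids the loss of precision that the alternating signs of the Laguerre coefficients would otherwise cause for larger values of the principal quantum number.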
Using these data, you can now easily evaluate Eq. 13.19 to yield

$$E_1^{(2)} \approx -\frac{8\pi \varepsilon_r \varepsilon_0}{Z^4}\, \mathcal{E}^2 a_B^3\, (0.7399 + 0.1001 + 0.0330 + 0.01512 + \cdots) \approx -1.78\, \frac{4\pi \varepsilon_r \varepsilon_0}{Z^4}\, \mathcal{E}^2 a_B^3. \tag{13.20}$$

Adding more terms to the sum does not affect the numerical coefficient too much:
going from four terms to 200 changes this factor by 2%. It is interesting to note
that the ground state energy of hydrogen in the electric field can be found exactly,
without resorting to the perturbation theory. This theory is too complicated to
discuss in this book, but no one can forbid me to use its result for comparison.
The exact solution produces the factor 2.25 instead of 1.78, which is a 20% difference.
I would say that this is a pretty decent approximation given the difference in the
amount of effort required to derive the approximate and exact results.

Equation 13.20 can also be recast in another illuminating form. By multiplying
the numerator and denominator of this equation by $e^2$, you can notice that the
resulting expression contains the ground state energy $E_g$ in its denominator. Making
this fact explicit, you can rewrite Eq. 13.20 in the form

$$E_1^{(2)} = -\frac{0.89}{Z^2}\, \frac{e^2 a_B^2 \mathcal{E}^2}{E_g}, \tag{13.21}$$

where the numerator has a clear physical meaning: $e a_B \mathcal{E}$ is the change of the
potential energy of the electron in the field $\mathcal{E}$ over the distance equal to the “size”
of the atom expressed by the Bohr radius $a_B$. This expression also makes it much
easier to get a feeling for the numerical magnitude of the Stark effect. The Bohr
radius (assuming that we are dealing with an actual hydrogen atom
in vacuum) is $a_B = 5.29 \times 10^{-11}$ m, and taking $\mathcal{E} = 10^6$ V/m as a typical value for the electric
field, I find for $e a_B \mathcal{E}$ in electron volts $e a_B \mathcal{E} \approx 5 \times 10^{-5}$ eV. Recalling
that the ground state energy of hydrogen in vacuum is 13.6 eV, I can estimate the
quadratic Stark effect correction to the energy as (ignoring numerical coefficients of
the order of unity) being of the order of $10^{-10}$ eV. This change in energy levels is
observed by measuring the electric field-induced shift of the absorption or emission
lines in the hydrogen spectrum. The energy shift of the order of $10^{-10}$ eV translates
into a frequency shift of about $10^5$ Hz. This shift of spectral lines is what is known
as the Stark effect, and because in the case considered in this section the shift
is quadratic in the field, it is qualified as the quadratic Stark effect. The effect was
discovered in 1913 by the German physicist Johannes Stark, who was awarded the 1919
Nobel Prize in Physics for this discovery. Stark was an active supporter of the
Nazi regime and was closely involved in the Deutsche Physik movement, whose goal
was to cleanse German science of foreign, mostly Jewish, influence. It was he who
described Heisenberg as a “White Jew” after Heisenberg publicly defended Einstein’s
relativity theory. He was probably the only famous physicist who after the war
was sentenced to a prison term for collaboration with Hitler’s regime.
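The order-of-magnitude estimate of the quadratic Stark shift can be repeated for any field strength with a few lines of code. This is a rough sketch under stated assumptions: an actual hydrogen atom in vacuum ($Z = 1$), the numerical coefficient 0.89 quoted in the text, and a conversion to frequency by dividing the energy by Planck's constant $h$; the function names are mine.

```python
A_B_M = 5.29e-11           # Bohr radius in meters
E_G_EV = 13.6              # ground state binding energy of hydrogen in eV
H_EV_S = 4.135667696e-15   # Planck constant in eV*s


def potential_drop_ev(field_v_per_m):
    """Energy e*a_B*E in eV; numerically equal to a_B*E expressed in volts."""
    return A_B_M * field_v_per_m


def stark_shift_ev(field_v_per_m, coefficient=0.89):
    """Magnitude of the quadratic Stark shift of the ground state in eV (Z = 1)."""
    return coefficient * potential_drop_ev(field_v_per_m) ** 2 / E_G_EV


def shift_frequency_hz(field_v_per_m):
    """Frequency corresponding to the energy shift, nu = |E| / h."""
    return stark_shift_ev(field_v_per_m) / H_EV_S


if __name__ == "__main__":
    field = 1.0e6  # V/m, the typical field used in the text
    print(f"e a_B E    = {potential_drop_ev(field):.2e} eV")
    print(f"|E^(2)|    = {stark_shift_ev(field):.2e} eV")
    print(f"line shift = {shift_frequency_hz(field):.2e} Hz")
```

Because the shift is quadratic in the field, a tenfold increase of the field strength makes the shift a hundred times larger.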
13.1.2 Atom’s Polarizability

As you just saw, the modification of the energy eigenvalues in the presence of the 
electric field manifests itself as a Stark effect—shift of the spectral lines in the 
absorption or emission spectra of an atom. In this section I will focus on an effect 
resulting from the modification of the wave functions. 

The wave functions, $\psi(\mathbf{r})$, as you all know, characterize the probability density
for the electron’s coordinate, $P(\mathbf{r}) = |\psi(\mathbf{r})|^2$, so that the related quantity
$\rho(\mathbf{r}) = e |\psi(\mathbf{r})|^2$ can be interpreted as a charge density. The easiest way to understand why
this is so is to imagine that you are dealing with an ensemble of $N$ non-interacting
electrons (please indulge me here, pretending that such electrons do exist). Then
$N P(\mathbf{r})\, d^3r$ determines the number of electrons within the volume element $d^3r$,
and since each of them carries a charge $e$, the charge within this volume would
be $e N P(\mathbf{r})\, d^3r$. Accordingly, the charge density (charge per unit volume) would be
$e N P(\mathbf{r})$. Now go back to the case of a single electron ($N = 1$), and you have your
charge density $\rho(\mathbf{r})$ as defined just a few lines above. If you want, you can imagine
an atomic electron as being a continuously distributed charged cloud rather than
a point-like object.

In the absence of the electric field, the wave functions of the electron have a
definite parity as was discussed in the previous section. The charge density $\rho(\mathbf{r})$, in
this case, is always an even function. One of the important consequences of this is
that the average position vector of the electron, $\int \mathbf{r}\, |\psi(\mathbf{r})|^2 d^3r$, is zero. The quantity

$$\langle \mathbf{d} \rangle = -e \int \mathbf{r}\, |\psi(\mathbf{r})|^2 d^3r$$

can be interpreted as an expectation value of a dipole operator $\hat{\mathbf{d}} = -e \hat{\mathbf{r}}$, which is
also, obviously, zero (the negative sign in the definition of the dipole moment reflects
the negativity of the electron’s charge). I can say that the hydrogen atom does not
have a permanent (independent of the field) dipole moment. Now, try to imagine
what would happen if the atom is placed in the electric field. Classically speaking, 
the field will exert a force on the electron shifting it to a new equilibrium position. 
When describing this effect in terms of the electron cloud, you can imagine that 
the cloud is displaced so that its center does not coincide with the position of the 
nucleus (see Fig. 13.1). As a result the expectation value of the electron’s dipole 
moment takes on a non-zero value, dependent on the field. This process is called 
polarization, and a dipole whose dipole moment appears only in the presence of an 
external field is called polarizable dipole. My task in this section is to describe this 
effect quantitatively using the first-order corrections to the electron’s wave function 
given by Eq. 13.10. 

Formally speaking, what I need to do is to compute the expectation value of
the dipole moment operator using the eigenvector perturbed by the field. The
computation is pretty straightforward, so let me just get on with it:

Fig. 13.1 Spatial distribution of the electron charge density in the presence of the electric field. The large ellipse represents the electron cloud, while the small black circle is the nucleus. The center of the cloud is shifted with respect to the nucleus because of the electric field represented by green field lines

(I hope you have not yet forgotten that in transitioning from ket vectors to bra
vectors, you are supposed to complex conjugate all complex quantities.) But, let
me continue:


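$$
\langle s|\,\mathbf{d}\,|s\rangle =
\sum_{n=2}^{\infty}\sum_{l=0}^{n-1}\sum_{m=-l}^{l}
\frac{V_{nlm,100}}{E_1^{(0)} - E_n^{(0)}}\,\langle 100|\,\mathbf{d}\,|nlm\rangle
+ \sum_{n=2}^{\infty}\sum_{l=0}^{n-1}\sum_{m=-l}^{l}
\frac{V^{*}_{nlm,100}}{E_1^{(0)} - E_n^{(0)}}\,\langle nlm|\,\mathbf{d}\,|100\rangle
= 2\sum_{n=2}^{\infty}\sum_{l=0}^{n-1}\sum_{m=-l}^{l}
\frac{\mathrm{Re}\left[V_{nlm,100}\,\mathbf{d}_{100,nlm}\right]}{E_1^{(0)} - E_n^{(0)}}.
$$

Here is what I have done. First, I remembered that $\langle s^{(0)}|\,\mathbf{d}\,|s^{(0)}\rangle = 0$ as we just
discussed, then I dropped the terms proportional to $\epsilon^2$ (that is, quadratic in the
perturbation) because I have no right to keep them. Finally, I used the defining
property of the Hermitian operators, $\langle q^{(0)}|\,\mathbf{d}\,|s^{(0)}\rangle = \langle s^{(0)}|\,\mathbf{d}\,|q^{(0)}\rangle^{*}$,
which allowed me to replace

$$
V_{qs}\,\langle s^{(0)}|\,\mathbf{d}\,|q^{(0)}\rangle + V^{*}_{qs}\,\langle q^{(0)}|\,\mathbf{d}\,|s^{(0)}\rangle
$$

with

$$
V_{qs}\,\langle s^{(0)}|\,\mathbf{d}\,|q^{(0)}\rangle + V^{*}_{qs}\,\langle s^{(0)}|\,\mathbf{d}\,|q^{(0)}\rangle^{*}
= 2\,\mathrm{Re}\left[V_{qs}\,\langle s^{(0)}|\,\mathbf{d}\,|q^{(0)}\rangle\right].
$$

Oh, yes, I also introduced a dipole matrix element $\mathbf{d}_{sq} = \langle s^{(0)}|\,\mathbf{d}\,|q^{(0)}\rangle$ and replaced
the abridged indexes $s$ and $q$ with full notation $1,0,0$ and $n,l,m$ reflecting the nature
of the states in question. Now, let me recall that $\mathbf{d} = -e\mathbf{r}$, and, therefore,
$\mathbf{d}_{100,nlm} = -e\,\mathbf{r}_{100,nlm}$, while $V_{nlm,100} = -e\mathcal{E}z_{nlm,100}$, so that, for the dipole
expectation value, I have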


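$$
\langle s|\,\mathbf{d}\,|s\rangle = 2e^2\mathcal{E}
\sum_{n=2}^{\infty}\sum_{l=0}^{n-1}\sum_{m=-l}^{l}
\frac{\mathrm{Re}\left[z_{nlm,100}\,\mathbf{r}_{100,nlm}\right]}{E_1^{(0)} - E_n^{(0)}}. \tag{13.22}
$$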

The matrix element $z_{nlm,100}$ has been calculated in the previous section, so let’s focus
on $\mathbf{r}_{100,nlm}$. First of all, you need to recognize that this is a vector, and, therefore, you
are dealing here with three matrix elements: $x_{100,nlm}$, $y_{100,nlm}$, and $z_{100,nlm}$, where
$x$, $y$, and $z$ are coordinates defined in the same coordinate system that I used to
describe the Stark effect (the $Z$-axis is along the electric field). The $z$-component of
this matrix element is obviously $z_{100,nlm} = z^{*}_{nlm,100}$, and since the latter is
real, they both are given by Eq. 13.17. The arguments that led me to conclude that
$z_{nlm,100}$ is zero unless $m = 0$ do not apply to $x_{100,nlm}$ and $y_{100,nlm}$ because both $x$ and $y$
coordinates, when expressed in a spherical coordinate system, contain the azimuthal
angle $\varphi$: $x = r\sin\theta\cos\varphi$, $y = r\sin\theta\sin\varphi$. Instead of dealing with $x_{100,nlm}$ and
$y_{100,nlm}$ separately, I find it more convenient to deal with $x + iy$, which in spherical
coordinates becomes simply $r\sin\theta\exp(i\varphi)$, and the respective matrix element can
now be presented as

$$
x_{100,nlm} + i\,y_{100,nlm} = \frac{1}{\sqrt{4\pi}}
\int_0^{\pi}\!\int_0^{2\pi}\!\int_0^{\infty} d\theta\,d\varphi\,dr\,
\sin\theta\; r^3 \sin\theta\, e^{i\varphi}\, Y_l^m(\theta,\varphi)\, R_{nl}(r)\, R_{10}(r). \tag{13.23}
$$


As it has become clear from the discussion of $z_{100,nlm}$, for the integral over $\varphi$
not to vanish, the factor $\exp(i\varphi)$ in Eq. 13.23 must be canceled by an appropriate factor
from the spherical harmonic $Y_l^m(\theta,\varphi)$. Such a cancelation is only possible if the index
$m$ in the spherical harmonic is $m = -1$, indicating that the entire matrix element
is different from zero only for $m = -1$. This means that our job here is done:
we can forget about both the $x_{100,nlm}$ and $y_{100,nlm}$ components of the matrix element
$\langle s|\,\mathbf{d}\,|s\rangle$. Indeed, Eq. 13.22 contains the product of $z_{nlm,100}$ with $x_{100,nlm}$, $y_{100,nlm}$,
or $z_{100,nlm}$, which does not vanish only if both factors in the product are not zero.
We know that $z_{nlm,100}$ is zero for all $m \neq 0$, while $x_{100,nlm}$ and $y_{100,nlm}$ are not zero
only if $m = -1$. No amount of Hogwarts magic can make these two conditions
be fulfilled simultaneously, assuring that $d_z$ is the only non-zero component of the
dipole expectation value

$$
\langle s|\,d_z\,|s\rangle = 2e^2\mathcal{E}
\sum_{n=2}^{\infty}\sum_{l=0}^{n-1}\sum_{m=-l}^{l}
\frac{\mathrm{Re}\left[z_{nlm,100}\,z_{100,nlm}\right]}{E_1^{(0)} - E_n^{(0)}}
= 2e^2\mathcal{E}
\sum_{n=2}^{\infty}\sum_{l=0}^{n-1}\sum_{m=-l}^{l}
\frac{\left|z_{nlm,100}\right|^2}{E_1^{(0)} - E_n^{(0)}}, \tag{13.24}
$$

where, in the last line, I dropped the sign of taking the real part because
$z_{nlm,100}\,z_{100,nlm} = |z_{nlm,100}|^2$ is a real quantity. The fact that only the $z$-component of
the dipole moment survives the process of taking the expectation value shouldn’t
surprise anyone, of course. In the absence of “special” directions, the direction of
the external field is the only one which the induced dipole moment of the atom can
reasonably be expected to have.

Equation 13.24 demonstrates that the induced dipole moment is codirected with
the external field and is proportional to it. This linear dependence between the dipole
moment and the field allows one to introduce an important quantity, called polarizability,
which is defined as a coefficient of proportionality between the dipole moment and
the field:

$$
\alpha_p = 2e^2
\sum_{n=2}^{\infty}\sum_{l=0}^{n-1}\sum_{m=-l}^{l}
\frac{\left|z_{nlm,100}\right|^2}{E_1^{(0)} - E_n^{(0)}}. \tag{13.25}
$$

The expectation value of the dipole operator, $\langle \mathbf{d} \rangle$, can now be written simply as

$$
\langle \mathbf{d} \rangle = \alpha_p\,\boldsymbol{\mathcal{E}}. \tag{13.26}
$$

The significance of Eq. 13.26 goes beyond this particular example: the linear relation 
between the induced dipole moment and the external field remains valid in a wide 
variety of cases, well outside of the particular model considered here. 

Now, prepare to get surprised. Go a few pages back, find the expression for the
second-order correction to the electron’s energy, and compare it with Eq. 13.24. To
make your life easier, I will copy this expression here:

$$
E_1^{(2)} = e^2\mathcal{E}^2
\sum_{n=2}^{\infty}\sum_{l=0}^{n-1}\sum_{m=-l}^{l}
\frac{\left|z_{100,nlm}\right|^2}{E_1^{(0)} - E_n^{(0)}}.
$$

Do you see that the expression for the correction to the energy level in the presence
of the field, which can be interpreted as a change of the electron’s energy due to the
field, can be written down as

$$
E_1^{(2)} = \frac{1}{2}\,\alpha_p\,\mathcal{E}^2. \tag{13.27}
$$

What is remarkable about this expression is that it represents the energy of a
classical polarizable dipole. Here is how it can be derived using nothing but classical
electrostatics. You probably remember that the energy of a dipole $\mathbf{d}$ in a uniform
electric field $\boldsymbol{\mathcal{E}}$ is $U_d = \mathbf{d}\cdot\boldsymbol{\mathcal{E}}$. This formula, however, was derived for a permanent
dipole, namely, the dipole whose dipole moment does not depend on the external
field. In the case under consideration, $\mathbf{d}$ depends on the field, and this expression
has to be modified. To take into account that the dipole moment changes with
the field, I will build the dipole moment by small increments $\delta\mathbf{d} = \alpha_p\,\delta\boldsymbol{\mathcal{E}}$. If the
respective change of the field $\delta\boldsymbol{\mathcal{E}}$ from a given value $\boldsymbol{\mathcal{E}}$ is infinitesimally small,
one can neglect the corresponding change of the dipole moment and use for the
corresponding increment in the energy the formula

$$
\delta U_d = \delta\mathbf{d}\cdot\boldsymbol{\mathcal{E}} = \alpha_p\,\boldsymbol{\mathcal{E}}\cdot\delta\boldsymbol{\mathcal{E}} = \alpha_p\,\mathcal{E}\,\delta\mathcal{E},
$$

where in the last step I assumed that $\boldsymbol{\mathcal{E}}$ and $\delta\boldsymbol{\mathcal{E}}$ have the same direction, so that I can
replace both vectors with their magnitudes. To find the total energy of the dipole
induced by field $\boldsymbol{\mathcal{E}}$, I now need to add together all these small increments starting
from the ones corresponding to zero field and finishing when the desired value of
the field is reached. I probably did not have to use that many words to describe a
standard operation of integration and could write right away

$$
U = \alpha_p \int_0^{\mathcal{E}} \mathcal{E}'\,d\mathcal{E}' = \frac{1}{2}\,\alpha_p\,\mathcal{E}^2,
$$

which is exactly Eq. 13.27. Probably, this is just my personal quirk, but at some point
in my life, I wondered how come the quantum expression for the energy of the
dipole, $\mathbf{d}\cdot\boldsymbol{\mathcal{E}}$, which goes into the Hamiltonian and is valid even for induced dipoles,
does not have the factor 1/2, while the classical version of the same expression for
polarizable dipoles is $\mathbf{d}\cdot\boldsymbol{\mathcal{E}}/2$. How can this be reconciled with the correspondence
principle, for instance? After learning how to compute the polarizability of atoms,
I understood that the classical expression must be compared with the expectation
value of the dipole moment, while the quantum formula is an operator expression.
Once you go from the operator to the expectation value, the classical and quantum
expressions for the energy of the polarizable dipole start agreeing.
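
The "building up by small increments" argument is easy to check numerically. The following Python sketch is only an illustration: the helper name `induced_dipole_energy` and the numerical values of the polarizability and the field are arbitrary assumptions, not quantities taken from the text. It adds up the increments $\alpha_p\,\mathcal{E}\,\delta\mathcal{E}$ from zero field to the final field and compares the sum with the closed-form result $\alpha_p\mathcal{E}^2/2$.

```python
def induced_dipole_energy(alpha_p, field, steps=100_000):
    """Add up the increments alpha_p * E * dE from zero field up to `field`."""
    d_field = field / steps
    total = 0.0
    for k in range(steps):
        e_mid = (k + 0.5) * d_field          # field in the middle of the k-th step
        total += alpha_p * e_mid * d_field   # energy increment for this step
    return total


alpha_p = 4.5    # arbitrary illustrative polarizability
field = 0.02     # arbitrary illustrative final field

u_numeric = induced_dipole_energy(alpha_p, field)
u_closed = 0.5 * alpha_p * field ** 2
print(u_numeric, u_closed)
```

The factor 1/2 appears because the dipole moment grows together with the field, so the "average" dipole moment during the build-up is half of its final value.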


13.2 Degenerate Perturbation Theory and Applications 
13.2.1 General Formulation 

If we are interested in finding the corrections to a degenerate energy eigenvalue,
the approach presented in Sect. 13.1 must be modified. In the introduction to this
chapter, I have already outlined a conceptual need for a modification. Namely,
I pointed out that in the degenerate case, the starting point of the perturbation
expansion (the zero-order approximation) is not clearly defined. There is also
another, more technical argument justifying the demand for a modified approach. If
you look at Eq. 13.10 for the first-order corrections to the eigenvector or Eq. 13.12
for the second-order corrections to the eigenvalue, you will notice the energy
denominator and the restriction $q \neq s$ on the summation. This restriction
assures that the energy denominator does not vanish, which would make the
respective term blow up. While this restriction works very well in the non-degenerate case,
if $E_s^{(0)}$ is degenerate, the condition $q \neq s$ does not guarantee that $E_q^{(0)} \neq E_s^{(0)}$
because there can be eigenvectors differing in some of their quantum numbers but
having the same energy. The new perturbation theory, in addition to resolving the
ambiguity about the zero-order term, must also ensure the absence of the exploding
terms in the perturbation expansion.

I will begin by refining the notation for the eigenvectors of the unperturbed
Hamiltonian belonging to degenerate eigenvalues. To this end I shall add an extra
index reflecting degeneracy, turning $|s^{(0)}\rangle$ into $|s^{(0)},\lambda\rangle$. Index $s$, as before, will
enumerate different eigenvalues, while the second index $\lambda$, which can combine
several indexes, will distinguish between eigenvectors belonging to the same
eigenvalue. In the case of the hydrogen atom, for instance, $s$ would be the principal
quantum number $n$, while $\lambda$ would correspond to two indexes $l$ and $m$.

Since I am not sure which of the eigenvectors belonging to a degenerate
eigenvalue shall be used as a zero-order term in the perturbation expansion, I will
begin with an arbitrary linear combination of those, hoping that somehow I will be
able to figure out what the “correct” combination is. Accordingly, in the derivation
outlined in Eqs. 13.2-13.8, I will replace $|s^{(0)}\rangle$ by

$$
|\chi_s\rangle = \sum_{\lambda} a_{\lambda}\,|s^{(0)},\lambda\rangle, \tag{13.28}
$$

so that Eqs. 13.6 and 13.7 become

$$
H_0 \sum_{\lambda} a_{\lambda}\,|s^{(0)},\lambda\rangle = E_s^{(0)} \sum_{\lambda} a_{\lambda}\,|s^{(0)},\lambda\rangle \tag{13.29}
$$

$$
H_0\,|s^{(1)}\rangle + V \sum_{\lambda} a_{\lambda}\,|s^{(0)},\lambda\rangle
= E_s^{(0)}\,|s^{(1)}\rangle + E_s^{(1)} \sum_{\lambda} a_{\lambda}\,|s^{(0)},\lambda\rangle. \tag{13.30}
$$

Equation 13.29 is fulfilled automatically by virtue of $|s^{(0)},\lambda\rangle$ being eigenvectors
of the Hamiltonian $H_0$, so I only need to deal with Eq. 13.30. I shall follow the
same procedure as in the non-degenerate case and first premultiply this equation by
$\langle\chi_s|$:

$$
\sum_{\mu} a_{\mu}^{*}\,\langle s^{(0)},\mu|\,H_0\,|s^{(1)}\rangle
+ \sum_{\mu}\sum_{\lambda} a_{\lambda} a_{\mu}^{*}\,\langle s^{(0)},\mu|\,V\,|s^{(0)},\lambda\rangle
= E_s^{(0)} \sum_{\mu} a_{\mu}^{*}\,\langle s^{(0)},\mu|s^{(1)}\rangle
+ E_s^{(1)} \sum_{\mu}\sum_{\lambda} a_{\lambda} a_{\mu}^{*}\,\langle s^{(0)},\mu|s^{(0)},\lambda\rangle. \tag{13.31}
$$

Recalling that $\langle s^{(0)},\mu|\,H_0 = E_s^{(0)}\,\langle s^{(0)},\mu|$, you can again, just like in the
non-degenerate case, cancel the first term on the left-hand side of Eq. 13.31 with the
first term on the right-hand side. Taking into account also that the eigenvectors,
even those belonging to a degenerate eigenvalue, are orthogonal, you can rewrite
Eq. 13.31 as

$$
\sum_{\lambda,\mu} a_{\lambda} a_{\mu}^{*}\,\langle s^{(0)},\mu|\,V\,|s^{(0)},\lambda\rangle = E_s^{(1)} \sum_{\mu} a_{\mu} a_{\mu}^{*}
\;\Rightarrow\;
\sum_{\mu} a_{\mu}^{*}\left(\sum_{\lambda} V^{(s)}_{\mu\lambda}\,a_{\lambda} - E_s^{(1)} a_{\mu}\right) = 0
\;\Rightarrow\;
\sum_{\lambda} V^{(s)}_{\mu\lambda}\,a_{\lambda} = E_s^{(1)} a_{\mu}, \tag{13.32}
$$

where I introduced a perturbation matrix $V^{(s)}_{\mu\lambda} = \langle s^{(0)},\mu|\,V\,|s^{(0)},\lambda\rangle$ constructed
using vectors $|s^{(0)},\lambda\rangle$ belonging to the same degenerate subspace defined by the
eigenvalue $E_s^{(0)}$. Equation 13.32 replaces Eq. 13.9 for degenerate eigenvalues. After
prolonged staring and maybe some meditation, you will recognize that this is your
old acquaintance: an eigenvalue equation, written this time for matrix $V^{(s)}_{\mu\lambda}$. The
eigenvalues are found, as usual, as zeroes of the determinant

$$
\det\left\| V^{(s)}_{\lambda,\mu} - E_s^{(1)}\,\delta_{\lambda,\mu} \right\| = 0, \tag{13.33}
$$

and substituting each of the solutions of this equation (sometimes called the secular
equation) back into Eq. 13.32, you will find sets of corresponding coefficients $a_{\lambda}$
engendering the “correct” zero-order eigenvectors.

To help you see a clearer picture of what you are dealing with here, I will now
consider, as an example, the simplest possible case of a double-degenerate eigenvalue.
In this case the indexes $\lambda$, $\mu$ take only two values so that $V^{(s)}_{\lambda\mu}$ becomes a $2\times 2$ matrix.
The secular Eq. 13.33, accordingly, takes the following form:

$$
\begin{vmatrix}
V^{(s)}_{11} - E_s^{(1)} & V^{(s)}_{12} \\
V^{(s)}_{21} & V^{(s)}_{22} - E_s^{(1)}
\end{vmatrix} = 0.
$$

This quadratic equation has, obviously, two easily found distinct solutions:

$$
E^{(1)}_{1,2} = \frac{1}{2}\left(V^{(s)}_{11} + V^{(s)}_{22}\right)
\pm \frac{1}{2}\sqrt{\left(V^{(s)}_{11} - V^{(s)}_{22}\right)^2 + 4\left|V^{(s)}_{12}\right|^2}. \tag{13.34}
$$

If the diagonal elements of the perturbation matrix vanish, as you already know
might happen quite often, Eq. 13.34 simplifies:

$$
E^{(1)}_{1,2} = \pm\left|V^{(s)}_{12}\right|. \tag{13.35}
$$

The main result revealed by Eqs. 13.34 and 13.35 is that a generic perturbation
lifts the degeneracy of the initial eigenvalue $E_s^{(0)}$ and splits it into two new distinct
eigenvalues

$$
E_{1,2} = E_s^{(0)} + \frac{1}{2}\left(V^{(s)}_{11} + V^{(s)}_{22}\right)
\pm \frac{1}{2}\sqrt{\left(V^{(s)}_{11} - V^{(s)}_{22}\right)^2 + 4\left|V^{(s)}_{12}\right|^2},
$$

where I set $\epsilon = 1$. Even though the correction given by Eq. 13.34 has been formally
derived in the first order of the perturbation theory, do not let this circumstance
deceive you: its dependence on the perturbation matrix elements is not linear.
Actually, it is not even analytic as the square root in it cannot be expanded into any
kind of a power series around the point where the perturbation turns zero. The
dependence of eigenvalues on the perturbation is not linear even in the case of
simplified Eq. 13.35, regardless of how “linear” this expression might look: an
absolute value of a complex number involves the square root as well.
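
As a quick numerical companion to Eqs. 13.34 and 13.35, here is a minimal Python sketch. The function name `first_order_shifts` and all numerical values are my own illustrative assumptions. It evaluates the two roots of the $2\times 2$ secular equation for a Hermitian perturbation block and checks that they indeed turn the determinant to zero.

```python
import math


def first_order_shifts(v11, v22, v12):
    """Roots of the 2x2 secular equation for a Hermitian perturbation block.

    v11 and v22 are the real diagonal elements, v12 is the (possibly complex)
    off-diagonal element; v21 is its complex conjugate and is not passed in.
    """
    mean = 0.5 * (v11 + v22)
    half_gap = 0.5 * math.sqrt((v11 - v22) ** 2 + 4.0 * abs(v12) ** 2)
    return mean + half_gap, mean - half_gap


# Zero diagonal elements: the shifts are +|V12| and -|V12|, whatever the sign
# or phase of V12 is, which is why the dependence on V12 is not linear.
for v12 in (0.1, -0.1, 0.1j):
    print(v12, first_order_shifts(0.0, 0.0, v12))

# Generic case: both roots must satisfy (V11 - E)(V22 - E) - |V12|^2 = 0.
v11, v22, v12 = 0.3, -0.1, 0.2 + 0.1j
roots = first_order_shifts(v11, v22, v12)
residuals = [(v11 - e) * (v22 - e) - abs(v12) ** 2 for e in roots]
print(roots, residuals)
```

Changing the sign of `v12` in the first loop leaves both shifts unchanged, which is the kink at zero perturbation that prevents a power-series expansion.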

Having found the corrections to energy, I can go back to Eq. 13.32 and find the
corresponding coefficients $a_{\lambda}$. In the case of double-degenerate eigenvalues, I shall
be looking for two sets of the coefficients, one for each of the corrections to the
energy. To distinguish between them, I will add the upper indexes 1 and 2 to $a_{\lambda}$, so
that the notation becomes $a_{\lambda}^{(1,2)}$. To make algebra less cumbersome so that you could
focus on what is really important, I will compute $a_{\lambda}^{(1,2)}$ only for the case
$V^{(s)}_{11} = V^{(s)}_{22} = 0$. Equation 13.32 for a double-degenerate eigenvalue with zero diagonal
elements of the perturbation matrix takes the form

$$
V^{(s)}_{12}\,a_2 = E_s^{(1)} a_1, \qquad V^{(s)}_{21}\,a_1 = E_s^{(1)} a_2.
$$

Substituting $E_s^{(1)} = \left|V^{(s)}_{12}\right|$ into the above equations, I obtain

$$
V^{(s)}_{12}\,a_2^{(1)} = \left|V^{(s)}_{12}\right| a_1^{(1)}, \qquad
V^{(s)}_{21}\,a_1^{(1)} = \left|V^{(s)}_{12}\right| a_2^{(1)}.
$$

Presenting the perturbation matrix element as $V^{(s)}_{12} = \left|V^{(s)}_{12}\right|\exp(i\varphi_V)$ and recalling
that $V^{(s)}_{21} = V^{(s)*}_{12}$, you can see that both equations above can be reduced to the same
form

$$
\frac{a_1^{(1)}}{a_2^{(1)}} = e^{i\varphi_V}. \tag{13.36}
$$

The second solution of the secular equation, $E_s^{(1)} = -\left|V^{(s)}_{12}\right|$, yields, instead of
Eq. 13.36,

$$
\frac{a_1^{(2)}}{a_2^{(2)}} = -e^{i\varphi_V}. \tag{13.37}
$$

To simplify the situation even further, let me assume that the matrix elements $V^{(s)}_{12}$
are real and positive ($\varphi_V = 0$), in which case the two sets of coefficients become
simply $a_1^{(1)} = a_2^{(1)}$ and $a_1^{(2)} = -a_2^{(2)}$. As it is typical for the eigenvalue problems, one
of the coefficients remains undefined and is found, again, using the normalization
condition $\left|a_1^{(1,2)}\right|^2 + \left|a_2^{(1,2)}\right|^2 = 1$. This yields

$$
a_1^{(1)} = a_2^{(1)} = \frac{1}{\sqrt{2}} \quad\text{and}\quad a_1^{(2)} = -a_2^{(2)} = \frac{1}{\sqrt{2}}.
$$

Now Eq. 13.28 yields for me two linearly independent orthogonal and normalized
“correct” zero-order eigenvectors:

$$
|\chi_s^{(1)}\rangle = \frac{1}{\sqrt{2}}\left(|s,1\rangle^{(0)} + |s,2\rangle^{(0)}\right) \tag{13.38}
$$

$$
|\chi_s^{(2)}\rangle = \frac{1}{\sqrt{2}}\left(|s,1\rangle^{(0)} - |s,2\rangle^{(0)}\right). \tag{13.39}
$$

It might appear that no traces of the perturbation matrix are left in Eqs. 13.38
and 13.39, but this appearance is deceptive. While it is true that the magnitude of
the perturbation matrix elements has dropped out, you must remember that I made a
very specific assumption about their phases. A different choice of the phase would
have resulted in different eigenvector combinations as you will see yourself when
working through the problems section of this chapter. Now the meaning of the
statement that the perturbation affects the zero-order eigenvectors even though its
magnitude goes to zero becomes clear: the transition to the zero perturbation limit
occurs in such a way that keeps the phase of the perturbation matrix elements intact.
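
To see the role of the phase explicitly, one can build the coefficient pairs for an arbitrary phase $\varphi_V$ of the off-diagonal element and check that they solve the eigenvalue problem. The sketch below is illustrative only: the helper names `zero_order_coefficients` and `apply_block` and the sample values of $V_{12}$ are assumptions of mine. It takes zero diagonal elements and uses the ratios $a_1/a_2 = \pm e^{i\varphi_V}$.

```python
import cmath
import math


def zero_order_coefficients(v12):
    """Return the pairs (a1, a2) for the shifts +|V12| and -|V12|.

    Assumes V11 = V22 = 0 and V21 = conj(V12).
    """
    phase = cmath.exp(1j * cmath.phase(v12))   # e^{i*phi_V}
    norm = 1.0 / math.sqrt(2.0)
    plus = (norm * phase, norm)     # a1/a2 = +e^{i*phi_V}
    minus = (norm * phase, -norm)   # a1/a2 = -e^{i*phi_V}
    return plus, minus


def apply_block(v12, a):
    """Multiply the matrix [[0, V12], [conj(V12), 0]] by the column (a1, a2)."""
    a1, a2 = a
    return (v12 * a2, complex(v12).conjugate() * a1)


for v12 in (0.2, 0.2j, 0.2 * cmath.exp(0.7j)):
    plus, minus = zero_order_coefficients(v12)
    print("V12 =", v12)
    print("  shift +|V12|:", plus, "->", apply_block(v12, plus))
    print("  shift -|V12|:", minus, "->", apply_block(v12, minus))
```

For a real positive `v12` the pairs reduce to $(1,1)/\sqrt{2}$ and $(1,-1)/\sqrt{2}$, while a purely imaginary `v12` of the same magnitude gives the same energy shifts but different combinations of the two degenerate states.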

Finally, I would like to make an almost trivial but nevertheless tremendously
important point: the off-diagonal matrix elements of the perturbation operator $V$
built with “correct” zero-order eigenvectors, Eqs. 13.38 and 13.39, vanish. Indeed,
by direct computation you may see that

$$
\langle\chi_s^{(1)}|\,V\,|\chi_s^{(2)}\rangle
= \frac{1}{2}\left({}^{(0)}\langle s,1| + {}^{(0)}\langle s,2|\right) V \left(|s,1\rangle^{(0)} - |s,2\rangle^{(0)}\right)
= \frac{1}{2}\left(V^{(s)}_{21} - V^{(s)}_{12}\right) = 0,
$$

where I took into account assumptions that went into Eqs. 13.38 and 13.39
($V^{(s)}_{11} = V^{(s)}_{22} = 0$, $V^{(s)}_{12} = V^{(s)}_{21}$). Of course, there is no surprise here: after all I defined
$|\chi_s^{(1)}\rangle$ and $|\chi_s^{(2)}\rangle$ as eigenvectors of $V$, and the matrix of any operator in the basis
of its eigenvectors is diagonal. Still, it is helpful (for intuition development) to see
the concrete realization of this abstract statement in this particular context. In a
more general situation, I can state that vectors

$$
|\chi_s^{(\lambda)}\rangle = \sum_{\lambda'} a^{(\lambda)}_{\lambda'}\,|s^{(0)},\lambda'\rangle, \tag{13.40}
$$

where coefficients $a^{(\lambda)}_{\lambda'}$ satisfy Eq. 13.32 for a given eigenvalue labeled by $\lambda$, diagonalize
the perturbation operator. This means that

$$
\langle\chi_s^{(\lambda)}|\,V\,|\chi_s^{(\lambda')}\rangle = E^{(1)}_{s,\lambda}\,\delta_{\lambda,\lambda'}. \tag{13.41}
$$


There is also another, rather abstract, but nonetheless important side to this
story. Eigenvectors belonging to a degenerate eigenvalue form what mathematicians
would call a subspace in a larger space spanned by all eigenvectors of the
Hamiltonian. It means that any linear combination of the degenerate eigenvectors
is again an eigenvector belonging to the same eigenvalue (see also Sect. 3.2.3).
The degree of degeneracy determines the dimension of this subspace. For instance,
in the example of a double-degenerate eigenvalue, we ended up with a
two-dimensional subspace of the eigenvectors defined by the basis of two linearly
independent orthogonal vectors $|\chi_s^{(1)}\rangle$ and $|\chi_s^{(2)}\rangle$. In the matrix representation of the
total Hamiltonian $H$, one can separate out a submatrix of smaller dimension, defined
on a subspace of eigenvectors belonging to the same eigenvalue. If one chooses as
a basis in this subspace vectors defined by Eq. 13.40, then the resulting submatrix
$H^{(s)}_{\lambda\lambda'} = \langle\chi_s^{(\lambda)}|\,H\,|\chi_s^{(\lambda')}\rangle$ will have only diagonal elements:

$$
H^{(s)}_{\lambda\lambda'} = \left(E_s^{(0)} + E^{(1)}_{s,\lambda}\right)\delta_{\lambda,\lambda'}. \tag{13.42}
$$

This conclusion is directly based on Eq. 13.41 and on the understanding that
any mutually orthogonal combinations of the degenerate eigenvectors $|s,\lambda\rangle^{(0)}$ are
eigenvectors of $H_0$ and, therefore, turn it into a diagonal matrix. This result can be
interpreted by saying that vectors defined by Eq. 13.40 are eigenvectors not only of
the perturbation operator but of the entire Hamiltonian, if it is considered only in the
subspace formed by degenerate eigenvectors. Another way to express the same idea
is to say that finding the first-order corrections to degenerate energy eigenvalues
is equivalent to exact diagonalization of the entire Hamiltonian performed on the
subspace of the degenerate eigenvectors. I will illustrate these points with a simple
example.

Example 30 (Diagonalization of a Hamiltonian on a Degenerate Subspace) Consider
a Hamiltonian defined by matrix

$$
H = \begin{bmatrix}
0 & \epsilon & 2i\epsilon \\
\epsilon & 1 & \epsilon \\
-2i\epsilon & \epsilon & 1
\end{bmatrix}.
$$

Presenting it in the form $H_0 + V$, where $H_0$ is the diagonal matrix, find the
“correct” zero-order eigenvectors belonging to a double-degenerate eigenvalue 1
and corresponding first-order eigenvalues.

Solution

The unperturbed Hamiltonian $H_0$ in this example is

$$
H_0 = \begin{bmatrix}
0 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{bmatrix},
$$

while the perturbation matrix can be written down as

$$
V = \epsilon \begin{bmatrix}
0 & 1 & 2i \\
1 & 0 & 1 \\
-2i & 1 & 0
\end{bmatrix}.
$$

It is obvious that the unperturbed Hamiltonian has two eigenvalues: 0 and 1, with
the latter being double degenerate. Corresponding eigenvectors are

$$
|1,1\rangle = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \qquad
|1,2\rangle = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}.
$$

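Now, using these two vectors, I need to build matrix $V^{(1)}_{ij}$, where the upper index
points to the eigenvalue that I am dealing with.

$$
V^{(1)}_{11} = \langle 1,1|\,V\,|1,1\rangle
= \epsilon \begin{bmatrix} 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} 0 & 1 & 2i \\ 1 & 0 & 1 \\ -2i & 1 & 0 \end{bmatrix}
\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
= \epsilon \begin{bmatrix} 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} = 0,
$$

$$
V^{(1)}_{12} = \langle 1,1|\,V\,|1,2\rangle
= \epsilon \begin{bmatrix} 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} 0 & 1 & 2i \\ 1 & 0 & 1 \\ -2i & 1 & 0 \end{bmatrix}
\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
= \epsilon \begin{bmatrix} 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} 2i \\ 1 \\ 0 \end{bmatrix} = \epsilon,
$$

$$
V^{(1)}_{22} = \langle 1,2|\,V\,|1,2\rangle
= \epsilon \begin{bmatrix} 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 0 & 1 & 2i \\ 1 & 0 & 1 \\ -2i & 1 & 0 \end{bmatrix}
\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
= \epsilon \begin{bmatrix} 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 2i \\ 1 \\ 0 \end{bmatrix} = 0.
$$

Thus, the perturbation matrix in the subspace belonging to the degenerate eigenvalue
takes the following form:

$$
V^{(1)} = \epsilon \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}.
$$

Eigenvalues $v_{1,2}$ of this matrix are found from equation

$$
v^2 - \epsilon^2 = 0,
$$

which yields $v_{1,2} = \pm\epsilon$. The respective eigenvectors are found from equations

$$
\epsilon \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}
= \pm\epsilon \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}
\;\Rightarrow\; a_2 = \pm a_1
$$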


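so that in the normalized form they can be written down as

$$
|\chi^{(1)}\rangle = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad
|\chi^{(2)}\rangle = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix}.
$$

I will complete this exercise by generating a $2\times 2$ matrix using eigenvectors
belonging to the degenerate eigenvalue out of the entire Hamiltonian, just to make
the connection with abstract arguments in the preceding paragraph. Here is how it
goes:

$$
H^{(1)}_{11} = \langle 1,1|\,H\,|1,1\rangle
= \begin{bmatrix} 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} 0 & \epsilon & 2i\epsilon \\ \epsilon & 1 & \epsilon \\ -2i\epsilon & \epsilon & 1 \end{bmatrix}
\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} \epsilon \\ 1 \\ \epsilon \end{bmatrix} = 1,
$$

$$
H^{(1)}_{12} = \langle 1,1|\,H\,|1,2\rangle
= \begin{bmatrix} 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} 0 & \epsilon & 2i\epsilon \\ \epsilon & 1 & \epsilon \\ -2i\epsilon & \epsilon & 1 \end{bmatrix}
\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
= \begin{bmatrix} 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} 2i\epsilon \\ \epsilon \\ 1 \end{bmatrix} = \epsilon,
$$

$$
H^{(1)}_{22} = \langle 1,2|\,H\,|1,2\rangle
= \begin{bmatrix} 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 0 & \epsilon & 2i\epsilon \\ \epsilon & 1 & \epsilon \\ -2i\epsilon & \epsilon & 1 \end{bmatrix}
\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
= \begin{bmatrix} 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 2i\epsilon \\ \epsilon \\ 1 \end{bmatrix} = 1.
$$

Putting all these elements together, I end up with a matrix

$$
\begin{bmatrix} 1 & \epsilon \\ \epsilon & 1 \end{bmatrix},
$$

which as you can see is, indeed, a submatrix of the total Hamiltonian formed by
cutting out the block formed by elements in the second and third rows and columns.
If I rewrite this matrix in a new basis formed by the “correct” eigenvectors (see
Eq. 5.97), I will have

$$
\frac{1}{2}
\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}
\begin{bmatrix} 1 & \epsilon \\ \epsilon & 1 \end{bmatrix}
\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}
= \frac{1}{2}
\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}
\begin{bmatrix} 1+\epsilon & 1-\epsilon \\ 1+\epsilon & -1+\epsilon \end{bmatrix}
= \begin{bmatrix} 1+\epsilon & 0 \\ 0 & 1-\epsilon \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
+ \begin{bmatrix} \epsilon & 0 \\ 0 & -\epsilon \end{bmatrix},
$$

in agreement with Eq. 13.42. Here I used the rule for transformation of an operator
from one basis to another as discussed in Sect. 5.2.2.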
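
The arithmetic of Example 30 can be double-checked with a few lines of Python. This is a minimal sketch using only the standard library: the value of `eps` and the helper name `matmul` are illustrative assumptions. It cuts the degenerate block out of the full Hamiltonian and rewrites it in the basis of the “correct” zero-order eigenvectors.

```python
import math

eps = 0.01  # illustrative value of the small parameter

# Full Hamiltonian of Example 30 in the original basis.
H = [[0.0,       eps, 2j * eps],
     [eps,       1.0, eps],
     [-2j * eps, eps, 1.0]]

# Block acting on the degenerate subspace (second and third basis vectors).
block = [[H[1][1], H[1][2]],
         [H[2][1], H[2][2]]]


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


s = 1.0 / math.sqrt(2.0)
U = [[s, s],
     [s, -s]]   # columns are the "correct" zero-order eigenvectors

# U is real and symmetric, so its Hermitian conjugate coincides with U itself.
rotated = matmul(U, matmul(block, U))
print(rotated)
```

The printed matrix is diagonal with the elements $1+\epsilon$ and $1-\epsilon$ (up to rounding), reproducing the last display of the example.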

I will complete the discussion of the secular equation by noting that solving 
Eq. 13.33 amounts to finding the roots of a polynomial of A-th order, where A is the 
degree of degeneracy of the corresponding eigenvalue. The fundamental theorem 
of algebra states that the polynomial of A-th order has A solutions, but does not 
exclude a possibility that some solutions can occur several times. The fact that 
Eq. 13.33 is an eigenvalue equation of a Hermitian operator ensures that all roots 
of the corresponding polynomial are real. If all these solutions differ from each 
other, we say that the perturbation completely lifts the degeneracy, but there might 
be situations in which the removal of the degeneracy is only partial, and some of the 
new eigenvalues still remain degenerate. 

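Now I am ready to proceed and start looking for corrections to the degenerate
eigenvectors, for which I have to go back to Eq. 13.30 and rewrite it in terms of
one of the newly found zero-order eigenvectors $|\chi_s^{(\lambda)}\rangle$ corresponding to a particular
eigenvalue $E^{(1)}_{s,\lambda}$ of the perturbation matrix:

$$
H_0\,|s^{(1)},\lambda\rangle + V\,|\chi_s^{(\lambda)}\rangle
= E_s^{(0)}\,|s^{(1)},\lambda\rangle + E^{(1)}_{s,\lambda}\,|\chi_s^{(\lambda)}\rangle.
$$

If I now premultiply this equation by $\langle\chi_s^{(\lambda)}|$, I will get, taking into account Eq. 13.41,
the identity

$$
E^{(1)}_{s,\lambda} = \langle\chi_s^{(\lambda)}|\,V\,|\chi_s^{(\lambda)}\rangle,
$$

as expected, and if I premultiply it by $\langle\chi_s^{(\mu)}|$, where $\mu \neq \lambda$, I will get

$$
\langle\chi_s^{(\mu)}|s^{(1)},\lambda\rangle\,E_s^{(0)} = \langle\chi_s^{(\mu)}|s^{(1)},\lambda\rangle\,E_s^{(0)}
$$

(remember that eigenvalues of $H_0$ do not depend on $\mu$, and $\langle\chi_s^{(\mu)}|\,V\,|\chi_s^{(\lambda)}\rangle = 0$). Now, finally,
let me premultiply it by $\langle\chi_q^{(\mu)}|$, built from eigenvectors of the perturbation operator
$V$ in the subspace defined by a different (possibly degenerate) eigenvalue $q \neq s$:

$$
\langle\chi_q^{(\mu)}|s^{(1)},\lambda\rangle\,E_q^{(0)} + V_{q\mu,s\lambda}
= \langle\chi_q^{(\mu)}|s^{(1)},\lambda\rangle\,E_s^{(0)}
\;\Rightarrow\;
\langle\chi_q^{(\mu)}|s^{(1)},\lambda\rangle = \frac{V_{q\mu,s\lambda}}{E_s^{(0)} - E_q^{(0)}},
$$

where $V_{q\mu,s\lambda} = \langle\chi_q^{(\mu)}|\,V\,|\chi_s^{(\lambda)}\rangle$ is the perturbation matrix element computed with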

the zero-order eigenvectors obeying Eq. 13.40. Expanding $|s^{(1)},\lambda\rangle$ in the basis of
the “correct” zero-order eigenvectors, I will find

$$
|s^{(1)},\lambda\rangle = \sum_{q}\sum_{\mu} |\chi_q^{(\mu)}\rangle\langle\chi_q^{(\mu)}|s^{(1)},\lambda\rangle
= \sum_{q\neq s}\sum_{\mu} |\chi_q^{(\mu)}\rangle\langle\chi_q^{(\mu)}|s^{(1)},\lambda\rangle
+ \sum_{\mu} |\chi_s^{(\mu)}\rangle\langle\chi_s^{(\mu)}|s^{(1)},\lambda\rangle
= \sum_{q\neq s}\sum_{\mu} \frac{V_{q\mu,s\lambda}}{E_s^{(0)} - E_q^{(0)}}\,|\chi_q^{(\mu)}\rangle
+ \sum_{\mu} a^{(s)}_{\mu,\lambda}\,|\chi_s^{(\mu)}\rangle.
$$

Parameters $a^{(s)}_{\mu,\lambda} \equiv \langle\chi_s^{(\mu)}|s^{(1)},\lambda\rangle$ in this expression are not yet defined (a fact
routinely swept under the rug in way too many textbooks and online notes on this
topic). A similar situation also occurred in the non-degenerate case, but then we had
to deal with only one undefined parameter, and now their number is equal to the
degree of degeneracy. The total eigenvector with accuracy up to the linear order in
$\epsilon$ is

$$
|s,\lambda\rangle = |\chi_s^{(\lambda)}\rangle\left(1 + a^{(s)}_{\lambda,\lambda}\right)
+ \sum_{\mu\neq\lambda} a^{(s)}_{\mu,\lambda}\,|\chi_s^{(\mu)}\rangle
+ \sum_{q\neq s}\sum_{\mu} \frac{V_{q\mu,s\lambda}}{E_s^{(0)} - E_q^{(0)}}\,|\chi_q^{(\mu)}\rangle. \tag{13.43}
$$

Normalization condition for this vector yields, just like in the non-degenerate
case:

$$
\langle s,\lambda|s,\lambda\rangle = \left|1 + a^{(s)}_{\lambda,\lambda}\right|^2 = 1
$$

(in the linear order all terms with coefficients $a^{(s)}_{\mu,\lambda}$ drop out because vectors
$|\chi_s^{(\lambda)}\rangle$ and $|\chi_s^{(\mu)}\rangle$ with $\mu \neq \lambda$ are orthogonal), from which you can safely conclude
that $a^{(s)}_{\lambda,\lambda} = 0$. But what about all other still undefined coefficients $a^{(s)}_{\mu,\lambda}$? The
normalization condition turned out to be useless in this regard, but I have one more
trick up my sleeve. To avoid unnecessary complications, I will assume that the
perturbation completely lifts the initial degeneracy of the eigenvalue so that all
$E^{(1)}_{s,\lambda}$ are different. All eigenvectors of Hermitian operators belonging to different
eigenvalues must be mutually orthogonal. So, let me consider the inner product
$\langle\chi_s^{(\nu)}|s,\lambda\rangle$ with $\nu \neq \lambda$. Taking into account all available orthogonality conditions,
I find

$$
\langle\chi_s^{(\nu)}|s,\lambda\rangle
= \sum_{\mu\neq\lambda} a^{(s)}_{\mu,\lambda}\,\langle\chi_s^{(\nu)}|\chi_s^{(\mu)}\rangle
= \sum_{\mu\neq\lambda} a^{(s)}_{\mu,\lambda}\,\delta_{\nu,\mu}
= a^{(s)}_{\nu,\lambda} = 0.
$$

Now, when I have found all coefficients in Eq. 13.43, I can write down the final
expression for $|s,\lambda\rangle$:

$$
|s,\lambda\rangle = |\chi_s^{(\lambda)}\rangle
+ \sum_{q\neq s}\sum_{\mu} \frac{V_{q\mu,s\lambda}}{E_s^{(0)} - E_q^{(0)}}\,|\chi_q^{(\mu)}\rangle, \tag{13.44}
$$

which is cited in most textbooks but often under a false pretense. It is important to
understand that condition $q \neq s$ excludes all the terms with the same eigenvalue of
the energy, so no issues with a denominator of this expression going to zero ever
arise. The often cited fact that the perturbation matrix elements $V_{q\mu,s\lambda}$ turn zero for
$q = s$ and $\mu \neq \lambda$, thanks to the special choice of the zero-order eigenvectors, while
factually true, has very little to do with the justification of Eq. 13.44, contrary to
what you might have read.


13.2.2 Linear Stark Effect 

While the Stark effect observed for the non-degenerate ground state of the hydrogen
atom is quadratic in field, the response of degenerate energy levels of a hydrogen
atom to a uniform electric field can actually be linear. To illustrate the origin
of this effect, I will consider the first-order corrections to the degenerate energy
level of a hydrogen atom characterized by the principal quantum number $n > 1$
and eigenvectors $|n,l,m\rangle$. The subspace of the degenerate eigenvectors consists of
all eigenvectors with fixed principal number $n$ but varying orbital and magnetic
numbers $l < n$ and $-l \le m \le l$. The total number of such degenerate states and,
therefore, the dimensionality of the subspace are, as you know, $n^2$. The perturbation
operator is given again by Eq. 13.14, so that the perturbation matrix, limited to the
degenerate subspace, is

$$
V_{nl_1m_1,\,nl_2m_2} = -e\mathcal{E}\,\langle nl_1m_1|\,z\,|nl_2m_2\rangle.
$$

Using the arguments similar to the ones I used when analyzing matrix elements for
the non-degenerate case, I can show that this matrix is diagonal with respect to the
magnetic number $m$:

$$
V_{nl_1m_1,\,nl_2m_2} = -e\mathcal{E}\,\langle nl_1m_1|\,z\,|nl_2m_1\rangle\,\delta_{m_1,m_2}.
$$

(In case you forgot, the argument goes like this: in the position representation,
ket $|nl_2m_2\rangle$ contains a factor $\exp(im_2\varphi)$, and bra $\langle nl_1m_1|$ contributes $\exp(-im_1\varphi)$.
Since there are no other factors dependent on the azimuthal angle, integration of
$\exp[i(m_2 - m_1)\varphi]$ over the entire range of $\varphi$ from 0 to $2\pi$ yields zero unless
$m_2 = m_1$.) Another general property of the matrix element $\langle nl_1m_1|\,z\,|nl_2m_1\rangle$ can
be established by looking at its behavior with respect to the inversion operator. As
I have already mentioned more than once, if a Hamiltonian is invariant with respect
to inversion, the matrix elements computed with its eigenvectors must be even to
have a non-zero value. Now, the matrix element $\langle nl_1m_1|\,z\,|nl_2m_1\rangle$ consists of three
components: an operator $z$, which is odd (changes sign) upon inversion, ket vector
$|nl_2m_1\rangle$, and bra vector $\langle nl_1m_1|$, which transform upon inversion according to

$$
P\,|nl_2m_1\rangle = (-1)^{l_2}\,|nl_2m_1\rangle; \qquad \langle nl_1m_1|\,P = (-1)^{l_1}\,\langle nl_1m_1|.
$$

The transformation rule for the entire matrix element therefore becomes

$$
P\,\langle nl_1m_1|\,z\,|nl_2m_1\rangle = -(-1)^{l_1+l_2}\,\langle nl_1m_1|\,z\,|nl_2m_1\rangle.
$$

You can clearly see now that for the matrix element to remain invariant under
inversion and, therefore, to have a non-zero value, orbital numbers $l_1$ and $l_2$ must
have opposite parities: if one is even, the other one must be odd. Obviously, all
matrix elements with $l_1 = l_2$ vanish. Actually, it can be shown that the restriction
on the values of orbital numbers $l_1$ and $l_2$ is even more stringent: the matrix element
$\langle nl_1m_1|\,z\,|nl_2m_1\rangle$ is different from zero only if $|l_1 - l_2| = 1$.
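
Both ingredients of these selection rules are simple enough to check numerically. The sketch below is only an illustration with hypothetical helper names (`azimuthal_integral`, `parity_factor`): the first function approximates the integral of $\exp[i(m_2 - m_1)\varphi]$ over the full range of the azimuthal angle by a Riemann sum, and the second returns the factor $-(-1)^{l_1+l_2}$ acquired by the matrix element of $z$ under inversion.

```python
import cmath
import math


def azimuthal_integral(m1, m2, steps=2000):
    """Riemann sum for the integral of exp[i(m2 - m1) phi] over 0..2*pi."""
    d_phi = 2.0 * math.pi / steps
    total = sum(cmath.exp(1j * (m2 - m1) * k * d_phi) for k in range(steps))
    return total * d_phi


def parity_factor(l1, l2):
    """Factor acquired by <n l1 m| z |n l2 m> under inversion."""
    return -((-1) ** (l1 + l2))


for m1, m2 in ((0, 0), (1, 1), (0, 1), (-1, 1)):
    print("m1 =", m1, "m2 =", m2, "integral =", abs(azimuthal_integral(m1, m2)))

for l1, l2 in ((0, 0), (0, 1), (1, 1), (0, 2)):
    print("l1 =", l1, "l2 =", l2, "parity factor =", parity_factor(l1, l2))
```

The integral is equal to $2\pi$ for $m_1 = m_2$ and vanishes otherwise, while the parity factor is $+1$ only when $l_1$ and $l_2$ have opposite parities. Note that parity alone does not forbid combinations such as $l_1 = 1$, $l_2 = 2$, and it cannot tell apart $|l_1 - l_2| = 1$ from $|l_1 - l_2| = 3$; the stronger rule $|l_1 - l_2| = 1$ requires a separate argument.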

Having figured out the general properties of the perturbation matrix elements,
I can now turn to finding the first-order corrections to the energy eigenvalues.
Unfortunately, I cannot get too far in the general case of arbitrary $n$ because
I would have to search for the roots of a polynomial of order $n^2$, and
there are no methods of doing this analytically for an arbitrary $n$. Besides, I do
not really need it. To illustrate the effects of the electric field on a degenerate energy
eigenvalue, it is sufficient to consider an eigenvalue with the lowest nontrivial degree
of degeneracy, which is the eigenvalue with $n = 2$. There are four states belonging
to the same energy value $E_2^{(0)} = -E_g/4$: $|2,0,0\rangle$, $|2,1,-1\rangle$, $|2,1,0\rangle$, and $|2,1,1\rangle$,
to which I will assign numbers 1, 2, 3, and 4, respectively, so that I can label the
perturbation matrix with only two indexes, $V^{(2)}_{i,j}$, where $i$ and $j$ take values from one
to four, and as a reminder, the upper index refers to the principal quantum number
$n = 2$. For instance, the matrix element $V^{(2)}_{1,2}$ corresponds to $\langle 2,0,0|V|2,1,-1\rangle$,
$V^{(2)}_{1,3}$ corresponds to $\langle 2,0,0|V|2,1,0\rangle$, and so on. The task of solving
Eq. 13.33 for a $4 \times 4$ matrix might also appear to be too formidable; after all, it
involves finding the roots of a fourth-order polynomial, for which the general
analytic formulas are notoriously cumbersome. But if you do not succumb to an immediate panic
attack, you may find out that in this particular case, nature prepares for us a nice
surprise. Using the general properties of the matrix element established in the
previous paragraph ($m_1 = m_2$, $|l_1 - l_2| = 1$), you can easily find that the only
non-zero matrix elements of the perturbation matrix are

$$V^{(2)}_{1,3} = V^{(2)*}_{3,1} = -e\mathcal{E}\,\langle 2,0,0|z|2,1,0\rangle. \qquad (13.45)$$

All other matrix elements involving the $l = 0$ state $|2,0,0\rangle$ vanish because of the $m_1 = m_2$
selection rule, while all matrix elements between states with $l = 1$ vanish because
of the rule $|l_1 - l_2| = 1$. This is a huge simplification of our task because the secular
equation for the eigenvalue corrections now takes the form

$$\begin{vmatrix} -E_2^{(1)} & 0 & V^{(2)}_{1,3} & 0 \\ 0 & -E_2^{(1)} & 0 & 0 \\ V^{(2)*}_{1,3} & 0 & -E_2^{(1)} & 0 \\ 0 & 0 & 0 & -E_2^{(1)} \end{vmatrix} = \left(E_2^{(1)}\right)^2\left[\left(E_2^{(1)}\right)^2 - \left|V^{(2)}_{1,3}\right|^2\right] = 0.$$

Here I computed the $4 \times 4$ determinant using cofactor expansion first along the last
row of the initial determinant, which contains a single non-zero term, and then along
the second row of the remaining $3 \times 3$ determinant. The resulting equation has four
solutions, as expected, two of which coincide and are equal to zero. The remaining
two solutions are $E_2^{(1)} = \pm\left|V^{(2)}_{1,3}\right|$. What it means is that now, instead of a single
initial fourfold degenerate eigenvalue, we have three distinct eigenvalues, one of
which is doubly degenerate and is equal to the zero-order value:

$$E_{2;1,2} = E_2^{(0)}, \qquad E_{2;3,4} = E_2^{(0)} \pm \left|V^{(2)}_{1,3}\right|.$$
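As a quick cross-check of the secular equation, the sketch below evaluates the determinant of the 4 x 4 perturbation matrix by cofactor expansion and compares it with the factorized form $E^2(E^2 - V^2)$. It assumes the state ordering introduced above and a real matrix element; the function names `det` and `secular` and the sample numbers are illustrative choices, not part of the original derivation.

```python
from fractions import Fraction


def det(m):
    """Determinant of a square matrix (list of rows) by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry == 0:
            continue  # zero entries contribute nothing, which is what keeps the 4x4 case simple
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def secular(E, V):
    """det(W - E*I) for the n = 2 Stark matrix; basis order |2,0,0>, |2,1,-1>, |2,1,0>, |2,1,1>."""
    W = [[0, 0, V, 0],
         [0, 0, 0, 0],
         [V, 0, 0, 0],
         [0, 0, 0, 0]]
    return det([[W[i][j] - (E if i == j else 0) for j in range(4)] for i in range(4)])


V = Fraction(3, 2)  # sample value of the real matrix element V_{1,3} (arbitrary units)
samples = [Fraction(k, 4) for k in range(-8, 9)]

# The determinant should coincide with E^2 (E^2 - V^2) for every sampled E ...
matches_factorized_form = all(secular(E, V) == E**2 * (E**2 - V**2) for E in samples)
# ... and vanish exactly at E = 0 and E = +/-V.
roots_ok = all(secular(E, V) == 0 for E in (0, V, -V))
print(matches_factorized_form, roots_ok)
```

Exact rational arithmetic is used so that the comparison does not depend on floating-point tolerances.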

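To complete this calculation, I would need to evaluate the matrix element $V^{(2)}_{1,3}$. The
most straightforward path toward this goal is to use the position representation for
the hydrogen eigenvectors, which I, for your convenience, present below (saving
you a few minutes you would have to spend googling these formulas on your own):

$$|2,0,0\rangle \rightarrow \psi_{200} = \frac{1}{\sqrt{8\pi}}\left(\frac{Z}{a_B}\right)^{3/2}\left(1 - \frac{Zr}{2a_B}\right)\exp\left(-\frac{Zr}{2a_B}\right),$$

$$|2,1,0\rangle \rightarrow \psi_{210} = \frac{1}{\sqrt{32\pi}}\left(\frac{Z}{a_B}\right)^{3/2}\frac{Zr}{a_B}\exp\left(-\frac{Zr}{2a_B}\right)\cos\theta.$$

Substituting these expressions in Eq. 13.45, I get

$$V^{(2)}_{1,3} = -\frac{e\mathcal{E}}{8}\left(\frac{Z}{a_B}\right)^{3}\int_0^\infty dr\, r^3\,\frac{Zr}{a_B}\left(1 - \frac{Zr}{2a_B}\right)\exp\left(-\frac{Zr}{a_B}\right)\int_0^\pi d\theta\,\sin\theta\cos^2\theta,$$

where I expressed $z$ in spherical coordinates as $r\cos\theta$; introduced the spherical volume
element $r^2\sin\theta\,dr\,d\theta\,d\varphi$; carried out the trivial integration over $\varphi$, which yielded $2\pi$;
and separated the integrals with respect to angular and radial variables. The angular
integral is computed easily by the standard substitution $x = \cos\theta$,
$dx = -\sin\theta\,d\theta$ and yields

$$\int_{-1}^{1} dx\, x^2 = \frac{2}{3}.$$

All that is left now is to do the radial integral:

$$V^{(2)}_{1,3} = -\frac{e\mathcal{E}}{12}\left(\frac{Z}{a_B}\right)^{3}\int_0^\infty dr\, r^3\,\frac{Zr}{a_B}\left(1 - \frac{Zr}{2a_B}\right)\exp\left(-\frac{Zr}{a_B}\right).$$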

To turn the rather straightforward but tedious and boring job of computing the
remaining integral into something a bit more interesting and fun, I challenge you to
extract as much physical information as possible from this expression without actually doing the
integration. The trick (and this is the kind of trick that professional physicists
learn to do as a pure reflex) is to replace the integration variable $r$ with
something dimensionless. In the integral in question, such a substitution is almost
obvious: $x = Zr/a_B$, after which the integral becomes

$$V^{(2)}_{1,3} = \frac{e\mathcal{E}a_B}{12Z}\int_0^\infty dx\, x^4\left(\frac{x}{2} - 1\right)\exp(-x),$$

revealing that the matrix element is proportional to the drop of the potential energy
of the electron in the external field over the size of the atom exemplified by the Bohr
radius $a_B/Z$. The remaining integral contributes only a numerical factor, which I
will compute using the computational platform Mathematica with the result

$$V^{(2)}_{1,3} = \frac{3e\mathcal{E}a_B}{Z}.$$

Now I can give the explicit expressions for the energy eigenvalues modified by the
electric field:

$$E_{2;1,2} = -E_g/4,$$

$$E_{2;3,4} = -E_g/4 \pm 3e\mathcal{E}\frac{a_B}{Z}. \qquad (13.46)$$
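The numerical factor can also be obtained by hand, because every integral of the form $\int_0^\infty x^k e^{-x}\,dx$ equals $k!$. The sketch below assumes the standard hydrogen-like $n = 2$ wave functions with normalization constants $1/\sqrt{8\pi}$ and $1/\sqrt{32\pi}$ (each times $(Z/a_B)^{3/2}$) and works in units of $e\mathcal{E}a_B/Z$; the variable and function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def exp_moment(k):
    """Integral of x**k * exp(-x) over [0, inf) for a non-negative integer k, which equals k!."""
    return factorial(k)


# Angular part: integral of x**2 over [-1, 1] after the substitution x = cos(theta).
angular = Fraction(2, 3)

# Product of the two normalization constants, 1/(16*pi), times 2*pi from the azimuthal integral.
norm_times_azimuth = Fraction(2, 16)

# Radial part in the dimensionless variable x = Z r / a_B:
# integral of x**4 * (x/2 - 1) * exp(-x) = 5!/2 - 4!.
# The overall minus sign of Eq. 13.45 has been absorbed into the bracket (x/2 - 1).
radial = Fraction(exp_moment(5), 2) - exp_moment(4)

# V_{1,3} in units of e * (field strength) * a_B / Z.
v13 = norm_times_azimuth * angular * radial
print(radial, v13)
```

The factor of 3 reproduces the coefficient in Eq. 13.46.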

The Stark effect in the case of degenerate energy eigenvalues differs from its
non-degenerate counterpart in two important aspects. First, the corrections to the
eigenvalues are now linear in the field (hence the name: linear Stark effect).
Second, instead of a simple shift of the energy, two new energy eigenvalues emerge
above and below the initial level. Experimentally this effect is seen as a splitting of
the single emission or absorption spectral line corresponding to transitions between
the $n = 1$ ground state and all $n = 2$ states into three distinct lines, the distance
between which grows linearly with the increasing electric field. This phenomenon is
illustrated in Fig. 13.2, where the left panel shows the electric field-induced changes
in the energy spectrum of the atom together with possible quantum transitions
responsible for the absorption of light, while the right panel illustrates the field-induced
linear divergence of the emergent energy eigenvalues.

Fig. 13.2 Splitting of the $n = 2$ degenerate energy level of a hydrogen atom

The next stop of our linear Stark effect train is the “correct” zero-order
eigenvectors found by solving Eq. 13.32, which, taking into account the structure
of the perturbation matrix, can be written as

$$\begin{bmatrix} 0 & 0 & V^{(2)}_{1,3} & 0 \\ 0 & 0 & 0 & 0 \\ V^{(2)*}_{1,3} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{bmatrix} = E_2^{(1)}\begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{bmatrix}.$$

As far as the value of the matrix element $V^{(2)}_{1,3}$ is concerned, all that I need to
know about it at this point is that it is real. The resulting matrix equation yields
the following system of equations:

$$\begin{aligned} V^{(2)}_{1,3}a_3 &= E_2^{(1)}a_1 \\ 0 &= E_2^{(1)}a_2 \\ V^{(2)*}_{1,3}a_1 &= E_2^{(1)}a_3 \\ 0 &= E_2^{(1)}a_4 \end{aligned} \qquad (13.47)$$

which need to be solved for each of the found eigenvalues. Let’s first dispose of the
trivial case $E_2^{(1)} = 0$, which reduces this system to

$$\begin{aligned} V^{(2)}_{1,3}a_3 &= 0 \\ 0 &= 0 \\ V^{(2)*}_{1,3}a_1 &= 0 \\ 0 &= 0. \end{aligned}$$

The first and the third equations obviously yield $a_3 = a_1 = 0$, while the second
and the fourth equations at first glance do not tell much: they are just trivial
identities. What they are actually saying, however, is that coefficients $a_2$ and $a_4$
remain undefined and can be chosen at will. It is also important to remember that 0
is a degenerate eigenvalue, so that we have to choose two pairs of these coefficients,
and we want to do it in such a way that the resulting eigenvectors are orthogonal
to each other. Actually, this requirement is quite easy to fulfill by choosing for one
pair $a_2 = 1$, $a_4 = 0$ and for the other $a_2 = 0$, $a_4 = 1$. With this choice we end
up with two trivial and easily anticipated answers (check out the enumeration of the
eigenvectors I introduced before these calculations were started!):

$$|2\rangle^{(1)} = |2,1,-1\rangle; \qquad |2\rangle^{(2)} = |2,1,1\rangle. \qquad (13.48)$$


It is not surprising that eigenvalues unaffected by the perturbation are accompanied 
by eigenvectors from the initial zero-order set. 
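All four eigenvalue and eigenvector pairs of the perturbation matrix, including the two combinations of $|2,0,0\rangle$ and $|2,1,0\rangle$ that belong to the non-zero eigenvalues, can be verified by direct substitution into Eq. 13.47. The sketch below does this numerically for a sample value of the real matrix element (set to 1 in arbitrary units, an illustrative assumption); the helper names are hypothetical.

```python
from math import isclose, sqrt

V = 1.0  # sample value of the real matrix element V_{1,3}, arbitrary units

# Perturbation matrix in the basis |2,0,0>, |2,1,-1>, |2,1,0>, |2,1,1>.
W = [[0.0, 0.0, V, 0.0],
     [0.0, 0.0, 0.0, 0.0],
     [V, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0]]

s = 1.0 / sqrt(2.0)
candidates = {
    "|2,1,-1>": (0.0, [0.0, 1.0, 0.0, 0.0]),
    "|2,1,1>": (0.0, [0.0, 0.0, 0.0, 1.0]),
    "(|2,0,0> + |2,1,0>)/sqrt(2)": (+V, [s, 0.0, s, 0.0]),
    "(|2,0,0> - |2,1,0>)/sqrt(2)": (-V, [s, 0.0, -s, 0.0]),
}


def matvec(m, v):
    """Product of a matrix (list of rows) with a column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def is_eigenpair(m, eigenvalue, v):
    """True if m v = eigenvalue * v component by component."""
    return all(isclose(x, eigenvalue * y, abs_tol=1e-12) for x, y in zip(matvec(m, v), v))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


eigen_ok = {name: is_eigenpair(W, E, v) for name, (E, v) in candidates.items()}

# The four vectors should also form an orthonormal set.
vectors = [v for _, v in candidates.values()]
orthonormal = all(
    isclose(dot(vectors[i], vectors[j]), 1.0 if i == j else 0.0, abs_tol=1e-12)
    for i in range(4) for j in range(4)
)
print(eigen_ok, orthonormal)
```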


Substituting the third eigenvalue $E^{(1)}_{2;3} = V^{(2)}_{1,3}$ into Eq. 13.47, I have

$$\begin{aligned} V^{(2)}_{1,3}a_3 &= V^{(2)}_{1,3}a_1 \\ 0 &= V^{(2)}_{1,3}a_2 \\ V^{(2)}_{1,3}a_1 &= V^{(2)}_{1,3}a_3 \\ 0 &= V^{(2)}_{1,3}a_4. \end{aligned} \qquad (13.49)$$

These equations yield $a_2 = a_4 = 0$ and $a_3 = a_1$. The respective eigenvector
becomes

$$|2\rangle^{(3)} = \frac{1}{\sqrt{2}}\left(|2,0,0\rangle + |2,1,0\rangle\right), \qquad (13.50)$$

where I set $a_3 = a_1 = 1/\sqrt{2}$ for normalization purposes. I will let you have a bit of
fun with the last eigenvalue $E^{(1)}_{2;4} = -V^{(2)}_{1,3}$ and provide only the final answer for the
last eigenvector

$$|2\rangle^{(4)} = \frac{1}{\sqrt{2}}\left(|2,0,0\rangle - |2,1,0\rangle\right). \qquad (13.51)$$

13.3 Problems
Problems to Sect. 13.1

Problem 155 An electron in a one-dimensional infinite potential well

$$V(z) = \begin{cases} 0, & 0 < z < d \\ \infty, & z < 0,\ z > d \end{cases}$$

is also subjected to an electric field in the positive $z$ direction.

1. Write down analytically and sketch the total potential energy of the electron in
the presence of the field.

2. Find the first non-zero corrections to the electron’s energy eigenvalues due to the
field. For the ground state of the electron, evaluate the first three terms in the
appropriate sum and give a numerical estimate for the corrections assuming that
$d = 100$ nm and the electric field is $\mathcal{E} = 10^6$ V/m.

3. Obtain the first-order correction to the ground state wave function of the electron.

4. Determine the expectation value of the electron’s coordinate in the presence
of the field, and compare it to the expectation value without the field. Give a
qualitative explanation of the result.

5. Where in the well is the probability of finding the electron the largest?

Problem 156 Using the stationary perturbation theory, find the first nonvanishing
corrections to the energy eigenvalues and eigenvectors of an electron moving in a
one-dimensional harmonic oscillator potential $V_0 = m_e\omega^2z^2/2$ due to perturbation
potentials of two different kinds:

1. $V(z) = \eta z^3$

2. $V(z) = gz^4$

Do not use the position representation to compute the matrix elements; instead use
the ladder operator formalism.

Problem 157 Consider an electron interacting with a positively charged ion via a
harmonic oscillator potential $V_0 = m_e\omega^2r^2/2$, where the origin of the coordinate
system is chosen at the position of the positively charged particle, assumed immovable.

1. Find the corrections to the electron’s energy eigenvalues due to a uniform electric
field in the positive $z$ direction.

2. Find the polarizability of the electron, assuming it is in the ground state.

Do not use the position representation to compute the matrix elements; instead use
the ladder operator formalism.

Problem 158 An electron in an infinite potential well of width $d$ (see Problem 1)
is also subjected to a perturbation potential $V_p = a\delta(x - d/2)$. Find the corrections
to the energy eigenvalue up to the second order of the perturbation theory and the
corrections to the eigenvectors up to the first order.

Problem 159 The potential energy of an electron is given by the following 
expression: 


V(r) = 


e 



+ B 


X 2 +}’ 2 

r 3 


where the last two terms can be considered as a perturbation. Using non-degenerate
perturbation theory, find the first-order corrections to the energy and the wave
function of the ground state of the electron in an unperturbed Coulomb potential.

Problem 160 Consider a particle in the infinite potential well 


, x 0 

Vo(z) = 

(oo 

also subjected to a perturbation of the 

Vi (z) = 


( - 2)z1 

_ ) ^ d 


'0 


— d/2 < z < dll 
z < —d/2, z > d/2 

following form: 

— d/2 < z < d/2 
z < -d/2, z > d/2. 


Derive the expressions for the first non-zero corrections to the ground state energy 
in the infinite potential well and its corresponding wave function. 

Problem 161 Consider a three-dimensional isotropic harmonic oscillator subjected 
to the following perturbation potential: 

V = k(xy + xz + zy). 


Find the first nonvanishing correction to the ground state energy of the oscillator and
the first-order correction to its wave function. Hint: It is easier to do this problem in
Cartesian coordinates.


Problems to Sect. 13.2 

Problem 162 Find the “correct” zero-order wave functions for a double-degenerate 
energy level if the matrix elements of perturbation potential are

$$V_{12} = |v|(1 + i).$$

Problem 163 Consider a system described by the following Hamiltonian matrix:

$$H = \begin{bmatrix} -1 & i\epsilon & 2\epsilon & \epsilon(1 - i) \\ -i\epsilon & -1 & -\epsilon & \epsilon \\ 2\epsilon & -\epsilon & 2 & 2i\epsilon \\ \epsilon(1 + i) & \epsilon & -2i\epsilon & 1 \end{bmatrix},$$

where $\epsilon$ is a small real parameter, $\epsilon \ll 1$.

1. Present this Hamiltonian in the form $H_0 + V$, where $H_0$ is a diagonal matrix.

2. You shall see that $H_0$ has two pairs of double-degenerate eigenvalues. Using
degenerate perturbation theory, find the first-order corrections to both eigenvalues
and the “correct” zero-order eigenvectors.

3. Find the first-order corrections to the eigenvectors and the second-order corrections
to the eigenvalues.

Problem 164 Using degenerate perturbation theory, find the corrections to the
energy of the $n = 2$ hydrogen energy level due to a perturbation of the form

$$V = Azx.$$

Problem 165 Using the degenerate perturbation theory, find the first-order corrections
to the energy and the “correct” zero-order eigenvectors for the first excited state of
the system described in Problem 161.




Chapter 14 

Fine Structure of the Hydrogen Spectra 
and Zeeman Effect 




14.1 Spin-Orbit Interaction and Fine Structure 
of the Energy Spectrum of Hydrogen 

14.1.1 Spin-Orbit Contribution to the Hamiltonian 

You might still have a vague recollection of me mentioning the spin-orbit coupling 
in Sect. 9.5.1, where I introduced the tensor product of spin and orbital spaces as 
a means to construct vectors representing both orbital and spin components of a 
quantum state (if you do not remember that, you would do yourself a favor by going 
back and rereading that part of the book). More specifically, the issue of spin-orbit 
coupling came up in the discussion of generic vectors in the tensor product space, 
which could be presented as a superposition of basis vectors, in which different 
spin states are paired with different orbital components. Such states can be called 
spin-orbit coupled because the orbital properties of a system in such a state can be 
changed by affecting its spin and vice versa. However, practically, such states can 
only be realized in systems with actual spin-orbit interaction contributing a special 
term containing a combination of spin and orbital operators to their energy and, 
correspondingly, quantum Hamiltonian. This interaction is quite common. It appears 
in many ordinary systems, such as atoms or semiconductors, and is responsible for 
a number of important phenomena. In atoms it gives rise to the spectral features 
known in the early days of quantum mechanics, while in semiconductors it brings
about relatively recently discovered effects that allow one, for instance, to use
spin to control the spatial flow of electrons. Combining spin and orbital phenomena in
such nontrivial situations is never a simple task, if only because it doubles
the number of equations that must be solved. At the same time, the phenomena
resulting from the spin-orbit interaction are way too important to be simply ignored
and shall be discussed even if you are only starting to get comfortable with the intricacies of
the quantum description of the world. Therefore, in this section, I am giving you
a chance to learn about some aspects of the spin-orbit interaction in a relatively
non-threatening environment by considering, again, a simple model of a single
electron in a hydrogen-like atom.

Before going into the details of the physical mechanism responsible for the spin- 
orbit interaction in this system, let me try to guess its general operator structure 
using simple symmetry arguments. I will begin by making quite a trivial remark
that the energy term, whatever it might be, must be a scalar. This scalar, however,
must include the spin, which is described by a vector operator $\mathbf{S}$, and another vector operator
(or a combination thereof) related to the orbital spatial-temporal behavior of the
electron. There are only three vectors that could fit the bill: the position vector
$\mathbf{r}$, the momentum vector $\mathbf{p}$, and the orbital momentum vector $\mathbf{L}$. All three
operators are indeed vectors, but not all vectors are created equal. Momentum and
position are, as you already know, odd operators, while the orbital momentum is
even. This distinction is true for quantum operators as well as for classical vectors:
for this reason, the angular momentum and other vectors having the same property 
with respect to inversion are sometimes called “pseudo-vectors.” Other examples of 
classical “pseudo-vectors” are magnetic moments and magnetic field; in general 
anything that can be produced from normal vectors or related to them via the 
operation of vector (cross) product is a pseudo-vector. 

Since spin operators do not have a classical analog, it is impossible to use this 
criterion to determine how they would behave with respect to inversion. Luckily, 
there are other ways to make this determination. One can, for instance, look at 
the potential energy associated with the spin magnetic moment, Eq. 9.28. Since we 
know that the potential energy is invariant with respect to inversion (or any other 
transformation of coordinates for that matter), it immediately follows that the spin 
operator must have the same parity as the magnetic field, which is even. Reversing 
this argument, I can say that in order to produce an expression invariant with respect 
to inversion, one must combine the spin operator with another even operator, which 
excludes position and momentum operators, leaving the angular momentum as the 
only viable alternative. 

Thus, I hope these arguments have convinced you that the spin-orbit term in the
Hamiltonian must involve the dot product of spin and orbital momentum operators:

$$H_{so} = \lambda\,\mathbf{L}\cdot\mathbf{S}, \qquad (14.1)$$

where the proportionality constant $\lambda$ cannot be determined without going into details
of the actual physical mechanism underlying the spin-orbit interaction, and this is
what I am going to do now. The origin of the spin-orbit coupling in atoms is best
understood by looking at the electron-nucleus interaction from the point of view of
a moving electron. In the electron’s reference frame, the nucleus is seen as a positive
electric charge $Ze$ ($Z$ is the atomic number of the respective element) orbiting the
electron (which is at the center of the orbit) with the orbital speed of the
electron, but in the opposite direction. From electrodynamics you must know that
a moving electric charge creates a magnetic field $\mathbf{B}$. Ignoring for now the fact that
the charge is moving along a circular trajectory (I will get back to this point later), I
can use the solution of the Maxwell equations for a charge moving at constant velocity,
valid in the weakly relativistic limit $v \ll c$:^1

$$\mathbf{B} = -\frac{1}{c^2}\,\mathbf{v}\times\mathbf{E}, \qquad (14.2)$$

where $\mathbf{v}$ is the velocity of the electron and $\mathbf{E}$ is the electric field of the charge. The
negative sign in Eq. 14.2 reflects the fact that the velocity of the proton, which is
supposed to appear in this expression, is opposite to that of the electron. In the limit
$v \ll c$, the electric field is given by the standard Coulomb expression:


$$\mathbf{E} = \frac{1}{4\pi\varepsilon_0\varepsilon_r}\,\frac{Ze}{r^2}\,\mathbf{e}_r, \qquad (14.3)$$

where $\mathbf{e}_r$ is the unit vector in the radial direction toward the electron and all other
notations correspond to Chap. 8. Substitution of Eq. 14.3 into Eq. 14.2 yields for the
magnetic field

$$\mathbf{B} = -\frac{1}{4\pi\varepsilon_0\varepsilon_r}\,\frac{Ze}{c^2r^2}\,[\mathbf{v}\times\mathbf{e}_r] = -\frac{1}{4\pi\varepsilon_0\varepsilon_r}\,\frac{Zem_e}{c^2m_er^3}\,[\mathbf{v}\times\mathbf{r}] = \frac{1}{4\pi\varepsilon_0\varepsilon_r}\,\frac{Ze}{m_ec^2r^3}\,\mathbf{L},$$

where I replaced the unit vector $\mathbf{e}_r$ with the position vector of the electron relative
to the proton, $\mathbf{e}_r = \mathbf{r}/r$, multiplied the numerator and the denominator by the electron
mass $m_e$, changed the order of multiplication in the cross product $[\mathbf{v}\times\mathbf{r}]$, and
finally replaced $m_e[\mathbf{r}\times\mathbf{v}]$ with the orbital momentum $\mathbf{L}$. The direction of $\mathbf{L}$ in this
derivation determines the direction of the magnetic field $\mathbf{B}$. This fact can also be
confirmed using the right-hand rule, as is clear from Fig. 14.1.

Finally, recalling Eq. 9.7 for the magnetic moment of the electron as well as the
expression for the potential energy of the magnetic dipole in the magnetic field
($U = -\boldsymbol{\mu}_s\cdot\mathbf{B}$), you can find for the spin-orbit interaction energy:

$$H_{so} = \frac{1}{4\pi\varepsilon_0\varepsilon_r}\,\frac{Ze^2}{m_e^2c^2r^3}\,\mathbf{L}\cdot\mathbf{S}. \qquad (14.4)$$
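The step from the magnetic field to Eq. 14.4 can be spelled out as follows, assuming that Eq. 9.7 gives the spin magnetic moment in the standard form $\boldsymbol{\mu}_s = -(e/m_e)\mathbf{S}$ (gyromagnetic factor $g = 2$), which is not restated in this section:

```latex
\begin{aligned}
U &= -\boldsymbol{\mu}_s\cdot\mathbf{B} = \frac{e}{m_e}\,\mathbf{S}\cdot\mathbf{B}, \\
\mathbf{B} &= \frac{1}{4\pi\varepsilon_0\varepsilon_r}\,\frac{Ze}{m_e c^2 r^3}\,\mathbf{L}
\quad\Longrightarrow\quad
U = \frac{1}{4\pi\varepsilon_0\varepsilon_r}\,\frac{Ze^2}{m_e^2 c^2 r^3}\,\mathbf{L}\cdot\mathbf{S}.
\end{aligned}
```

Since the coefficient is positive, the energy is lowered when $\mathbf{L}$ and $\mathbf{S}$ point in opposite directions.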


^1 You might need to refresh your memory of classical electrodynamics at this point using the
Internet or one of the available undergraduate textbooks on electrodynamics.

Fig. 14.1 (a) A hydrogen atom in the reference frame of the nucleus. An electron is orbiting the
nucleus in the counterclockwise direction; vectors $\mathbf{v}$ and $\mathbf{r}$ represent the velocity and position vector
of the electron relative to the proton, so that its angular momentum $\mathbf{L}$ points out of the page. (b)
A hydrogen atom in the reference frame of the electron. The nucleus is traversing an orbit with the
electron at its center in the clockwise direction, creating a magnetic field at the electron’s location
also directed toward the viewer. Vectors $\mathbf{v}$ and $\mathbf{r}$ show the directions of the differential current element
$I\,d\mathbf{l}$ in the Biot-Savart law and the position of the point of observation of the magnetic field relative
to the local current element. Magnetic field $\mathbf{B}$ is also directed out of the page as prescribed by the
cross product $I\,d\mathbf{l}\times\mathbf{e}_r$

The presented derivation of Eq. 14.4 suffers from one significant shortcoming: it
is based on Eq. 14.2 derived for a particle moving at constant velocity, while the
motion of the electron in an atom is anything but uniform. For the same reason, an
electron is hardly an inertial reference frame, which makes the transition to it, used
in the derivation, also quite suspicious. The problems with this formula, however,
were first noticed by experimentalists who found that theoretical predictions of the
hydrogen spectra based on Eq. 14.4 deviate from the experimental results and that
the theory can be brought into better agreement with measurements if one divides
the expression in Eq. 14.4 by two. This discrepancy remained a mystery until the
British physicist Llewellyn Hilleth Thomas^2 re-derived in 1926 the expression for
the spin-orbit energy by carefully applying Lorentz transformations of the special
theory of relativity when transitioning to the electron’s reference frame. The key
to the correct solution was to take into account the rotation of the reference frame
attached to the electron as it traverses its trajectory around the nucleus. It is really
quite amazing that these calculations yielded a spin-orbit energy differing from
Eq. 14.4 exactly by the factor 1/2 required to bring the theory into agreement with
the experiments. The kinematic effect due to the rotating reference frame, which
actually has nothing to do with quantum mechanics and arises every time
one has to deal with rotating reference frames, is called Thomas precession, and the
corresponding factor 1/2 is called Thomas’s half. Taking Thomas’s half into account
yields the final expression for the spin-orbit contribution to the Hamiltonian


$$H_{so} = \frac{1}{8\pi\varepsilon_0\varepsilon_r}\,\frac{Ze^2}{m_e^2c^2r^3}\,\mathbf{L}\cdot\mathbf{S}. \qquad (14.5)$$

This expression agrees with our symmetry-based Eq. 14.1, where the undefined
parameter $\lambda$ can now be identified as

$$\lambda(r) = \frac{1}{8\pi\varepsilon_0\varepsilon_r}\,\frac{Ze^2}{m_e^2c^2r^3} \qquad (14.6)$$

and turns out to be a function of the radial coordinate of the electron.

^2 Llewellyn Hilleth Thomas was a British physicist who eventually moved to the USA, where
he held a professorial position at Ohio State University, was a member of the Watson Scientific
Computing Laboratory at Columbia University, and was the IBM First Fellow in the Watson
Research Center. His last position was at North Carolina State University.
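To get a feeling for the size of this term, one can estimate the energy scale $\hbar^2\langle\lambda(r)\rangle$ for the $2p$ state of hydrogen. The sketch below is only an order-of-magnitude illustration: it assumes $\varepsilon_r = 1$, hard-codes CODATA 2018 constants, and uses the standard hydrogen-like result $\langle r^{-3}\rangle = Z^3/[a_B^3 n^3 l(l + 1/2)(l + 1)]$, which is not derived in this section. The function names are hypothetical.

```python
from math import pi

# CODATA 2018 values in SI units.
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
M_E = 9.1093837015e-31       # electron mass, kg
C_LIGHT = 2.99792458e8       # speed of light, m/s
HBAR = 1.054571817e-34       # reduced Planck constant, J*s
A_BOHR = 5.29177210903e-11   # Bohr radius, m


def lambda_times_r3(Z=1, eps_r=1.0):
    """Coefficient of 1/r**3 in Eq. 14.6, i.e. lambda(r) * r**3, in SI units."""
    return Z * E_CHARGE**2 / (8 * pi * EPS0 * eps_r * M_E**2 * C_LIGHT**2)


def mean_inverse_r3(n, l, Z=1):
    """Hydrogen-like expectation value <1/r**3> for l >= 1 (standard result, assumed here)."""
    return Z**3 / (A_BOHR**3 * n**3 * l * (l + 0.5) * (l + 1))


# Energy scale hbar**2 * <lambda(r)> for the 2p state of hydrogen, converted to eV.
scale_2p_eV = HBAR**2 * lambda_times_r3() * mean_inverse_r3(2, 1) / E_CHARGE
print(f"hbar^2 <lambda> for 2p hydrogen: {scale_2p_eV:.2e} eV")
```

The result is a few times $10^{-5}$ eV, roughly five orders of magnitude below the $n = 2$ binding energy, which is why the spin-orbit interaction can be treated as a small correction.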


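14.1.2 Schrödinger Equation with Spin-Orbit Term


Having found the contribution of the spin-orbit interaction to the electron’s
Hamiltonian, I can attempt to write down the Schrödinger equation in the
coordinate-spinor representation. Using the representation of the spin matrices derived
in Sect. 5.2.3, I can present the operator product $\mathbf{L}\cdot\mathbf{S}$ as

$$\mathbf{L}\cdot\mathbf{S} = L_xS_x + L_yS_y + L_zS_z = \frac{\hbar}{2}\begin{bmatrix} L_z & L_x - iL_y \\ L_x + iL_y & -L_z \end{bmatrix} = \frac{\hbar}{2}\begin{bmatrix} L_z & L_- \\ L_+ & -L_z \end{bmatrix}, \qquad (14.7)$$

where in the last step I replaced the $x$ and $y$ components of the angular momentum
operator with the ladder operators $L_\pm$ introduced in Eqs. 3.59 and 3.60. Now,
the Schrödinger equation for the components $\psi_\uparrow(\mathbf{r})$ and $\psi_\downarrow(\mathbf{r})$ of the spinor $|\chi\rangle$,
Eq. 9.62, can be written down as

$$H_{orb}\psi_\uparrow(\mathbf{r}) + \frac{\hbar\lambda}{2}\left[L_z\psi_\uparrow(\mathbf{r}) + L_-\psi_\downarrow(\mathbf{r})\right] = E\psi_\uparrow(\mathbf{r}), \qquad (14.8)$$

$$H_{orb}\psi_\downarrow(\mathbf{r}) + \frac{\hbar\lambda}{2}\left[L_+\psi_\uparrow(\mathbf{r}) - L_z\psi_\downarrow(\mathbf{r})\right] = E\psi_\downarrow(\mathbf{r}). \qquad (14.9)$$

Thanks to the spin-orbit term, Eqs. 14.8 and 14.9 become interdependent, which
means that the spin and orbital states can no longer be chosen independently of
each other.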

Since the spin-orbit coupling parameter A depends on the radial coordinate, it is 
probably hopeless to try solving these equations exactly, but I still want to see how 
far I can go trying, even if just to satisfy my curiosity. On a more serious note, I do 
believe that playing a bit with these equations might provide a better understanding 
of the correlation between the orbital and spin states generated by the spin-orbit 
coupling. 

My first step would be, naturally, to try to separate the radial and angular variables.
Since both the $H_{orb}$ and spin-orbit terms in Eqs. 14.8 and 14.9 contain operators
of the angular momentum, I can try to look for the solution in the form $\psi_\uparrow(\mathbf{r}) =
R_\uparrow(r)Y_l^m(\theta,\varphi)$, where the first factor is a yet unknown radial function and $Y_l^m(\theta,\varphi)$
is the familiar spherical harmonic. However, inspecting the term containing $\psi_\downarrow(\mathbf{r})$
in Eq. 14.8, I notice that it appears in combination with the lowering ladder
operator $L_-$. This means that if I choose $\psi_\downarrow(\mathbf{r})$ with the angular dependence given by
the same spherical harmonic $Y_l^m(\theta,\varphi)$, I will be in trouble because $L_-Y_l^m(\theta,\varphi) \propto
Y_l^{m-1}(\theta,\varphi)$, and I would not be able to satisfy the equation (the angular parts of the
wave functions would not cancel). However, if I do not panic, it might occur to me
that by choosing $\psi_\downarrow(\mathbf{r})$ as $\psi_\downarrow(\mathbf{r}) = R_\downarrow(r)Y_l^{m+1}(\theta,\varphi)$, I can actually save the day
because $L_-Y_l^{m+1}(\theta,\varphi) \propto Y_l^m(\theta,\varphi)$, so that the angular dependence in all terms of
the equation will be given by $Y_l^m(\theta,\varphi)$, which I will be able to cancel, leaving only
the unknown radial components of the wave functions. For this to work, however,
the same choices for $\psi_\uparrow(\mathbf{r})$ and $\psi_\downarrow(\mathbf{r})$ must do the same trick for Eq. 14.9. A bit
of inspection will convince you that they actually do: $L_+Y_l^m(\theta,\varphi) \propto Y_l^{m+1}(\theta,\varphi)$,
which coincides with the angular dependence of all other terms containing $\psi_\downarrow(\mathbf{r})$.

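To give these arguments a more polished and technical form, let me first recall
from Chap. 8, Eq. 8.7 that

$$H_{orb}\left[R(r)Y_l^m\right] = Y_l^m\left\{-\frac{\hbar^2}{2\mu r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{\hbar^2l(l+1)}{2\mu r^2}R - \frac{1}{4\pi\varepsilon_r\varepsilon_0}\frac{Ze^2}{r}R\right\}$$

(see Chap. 8 for all notations). I will rewrite this expression introducing the radial
orbital operator $H_r$ as

$$H_{orb}\left[R(r)Y_l^m\right] = Y_l^mH_rR(r),$$

where

$$H_r = -\frac{\hbar^2}{2\mu r^2}\frac{d}{dr}\left(r^2\frac{d}{dr}\right) + \frac{\hbar^2l(l+1)}{2\mu r^2} - \frac{1}{4\pi\varepsilon_r\varepsilon_0}\frac{Ze^2}{r}.$$

I also need to recall that

$$L_zY_l^m(\theta,\varphi) = \hbar mY_l^m(\theta,\varphi);$$

$$L_-Y_l^{m+1}(\theta,\varphi) = \hbar\sqrt{l(l+1) - m(m+1)}\,Y_l^m(\theta,\varphi);$$

$$L_+Y_l^m(\theta,\varphi) = \hbar\sqrt{l(l+1) - m(m+1)}\,Y_l^{m+1}(\theta,\varphi).$$

Then Eq. 14.8 takes the form

$$Y_l^mH_rR_\uparrow(r) + \frac{\hbar^2\lambda}{2}mR_\uparrow(r)Y_l^m(\theta,\varphi) + \frac{\hbar^2\lambda}{2}\sqrt{l(l+1) - m(m+1)}\,R_\downarrow(r)Y_l^m(\theta,\varphi) = ER_\uparrow(r)Y_l^m(\theta,\varphi).$$

A similar procedure for Eq. 14.9 yields

$$Y_l^{m+1}H_rR_\downarrow(r) - \frac{\hbar^2\lambda}{2}(m+1)R_\downarrow(r)Y_l^{m+1}(\theta,\varphi) + \frac{\hbar^2\lambda}{2}\sqrt{l(l+1) - m(m+1)}\,R_\uparrow(r)Y_l^{m+1}(\theta,\varphi) = ER_\downarrow(r)Y_l^{m+1}(\theta,\varphi).$$

Now you can see that the spherical harmonics can, indeed, be canceled in both
equations so that I am left with a system of purely radial equations

$$\left(H_r + \frac{\hbar^2\lambda}{2}m\right)R_{n,l,m,\uparrow}(r) + \frac{\hbar^2\lambda}{2}\sqrt{l(l+1) - m(m+1)}\,R_{n,l,m+1,\downarrow}(r) = ER_{n,l,m,\uparrow}(r), \qquad (14.10)$$

$$\left(H_r - \frac{\hbar^2\lambda}{2}(m+1)\right)R_{n,l,m+1,\downarrow}(r) + \frac{\hbar^2\lambda}{2}\sqrt{l(l+1) - m(m+1)}\,R_{n,l,m,\uparrow}(r) = ER_{n,l,m+1,\downarrow}(r), \qquad (14.11)$$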


where I added additional indexes to the radial function to emphasize that it depends 
upon the four quantum numbers: the radial number n , which will emerge once the 
eigenvalues of energy are found from these equations, the orbital and magnetic 
numbers l and m, and finally the spin number, which in this case is designated by an 
arrow pointing up or down. 

I am afraid that is it—I reached the end of this path as I am not going to attempt 
solving Eqs. 14.10 and 14.11. Now let’s take a breath and try to understand if I 
can learn anything from this exercise. Leaving aside the issue of finding the radial 
functions, I do know now that the state of the atom in the presence of the spin-orbit
interaction is given by a spinor of the following form:

$$|\chi\rangle = \begin{bmatrix} R_{n,l,m,\uparrow}(r)Y_l^m(\theta,\varphi) \\ R_{n,l,m+1,\downarrow}(r)Y_l^{m+1}(\theta,\varphi) \end{bmatrix}, \qquad (14.12)$$

which can also be presented as a linear combination of basis spinors representing
eigenvectors of the spin operator $S_z$:

$$|\chi\rangle = R_{n,l,m,\uparrow}(r)Y_l^m(\theta,\varphi)\,|\uparrow\rangle + R_{n,l,m+1,\downarrow}(r)Y_l^{m+1}(\theta,\varphi)\,|\downarrow\rangle. \qquad (14.13)$$

Even though I do not know the radial functions in this expression, it still provides
me with some interesting information. For instance, it is clear from Eq. 14.13 that
$|\chi\rangle$ is an eigenvector of the operator $\mathbf{L}^2$:

$$\mathbf{L}^2|\chi\rangle = R_{n,l,m,\uparrow}\,\mathbf{L}^2Y_l^m(\theta,\varphi)\,|\uparrow\rangle + R_{n,l,m+1,\downarrow}(r)\,\mathbf{L}^2Y_l^{m+1}(\theta,\varphi)\,|\downarrow\rangle = \hbar^2l(l+1)\left[R_{n,l,m,\uparrow}Y_l^m(\theta,\varphi)\,|\uparrow\rangle + R_{n,l,m+1,\downarrow}(r)Y_l^{m+1}(\theta,\varphi)\,|\downarrow\rangle\right],$$

where I took into account that the $\mathbf{L}^2$ operator does not affect the radial functions or
spin eigenvectors. At the same time, it is also clear that $|\chi\rangle$ is not an eigenvector of
operators $L_z$ or $S_z$ (I will leave it to you to verify this fact on your own). But there
is more. Since $|\chi\rangle$ combines spin and orbital states, it makes complete sense to see
how this state will be affected by the operators of the total angular momentum $\mathbf{J}$
introduced in Sect. 9.5.2, beginning with the operator $J_z = L_z + S_z$. Remembering
that $L_z$ acts only on the orbital components of the state while $S_z$ affects only its
spinor components, I obtain

$$(L_z + S_z)\left[R_{n,l,m,\uparrow}Y_l^m\,|\uparrow\rangle + R_{n,l,m+1,\downarrow}Y_l^{m+1}\,|\downarrow\rangle\right] = \hbar\left(m + \frac{1}{2}\right)R_{n,l,m,\uparrow}Y_l^m\,|\uparrow\rangle + \hbar\left(m + 1 - \frac{1}{2}\right)R_{n,l,m+1,\downarrow}Y_l^{m+1}\,|\downarrow\rangle = \hbar\left(m + \frac{1}{2}\right)\left[R_{n,l,m,\uparrow}Y_l^m\,|\uparrow\rangle + R_{n,l,m+1,\downarrow}Y_l^{m+1}\,|\downarrow\rangle\right].$$

Rewriting this result in a more concise form

$$J_z|\chi\rangle = \hbar\left(m + \frac{1}{2}\right)|\chi\rangle$$

makes it plainly obvious that $|\chi\rangle$ is an eigenvector of $J_z$ with $m_j = m + 1/2$ (see
Eq. 9.74). At this point it is only natural to check out the relationship between $|\chi\rangle$
and the operator $\mathbf{J}^2$. On a hunch, I will compare Eq. 14.13 with Eqs. 9.93 and 9.94
representing eigenvectors of $\mathbf{J}^2$, but to make the comparison easier, I will first
rewrite Eq. 14.13 in an abstract form, replacing index $m$ with $m_j - 1/2$:

$$|\chi\rangle = R_{n,l,m_j-1/2,\uparrow}(r)\,|l, m_j - 1/2\rangle\,|\uparrow\rangle + R_{n,l,m_j+1/2,\downarrow}(r)\,|l, m_j + 1/2\rangle\,|\downarrow\rangle. \qquad (14.14)$$

Now it is plainly obvious that Eq. 14.14 has the same structure as the eigenvectors of $\mathbf{J}^2$,
$|j_1, l, m_j\rangle$ and $|j_2, l, m_j\rangle$, where $j_1$ and $j_2$ are defined in Eqs. 9.88 and 9.89, and if I
define the radial functions as

$$R_{n,l,m,\uparrow} = \tilde{R}_{n,l,m_j}\frac{1}{\sqrt{2l+1}}\sqrt{l + m_j + 1/2} \qquad (14.15)$$

$$R_{n,l,m+1,\downarrow} = \tilde{R}_{n,l,m_j}\frac{1}{\sqrt{2l+1}}\sqrt{l - m_j + 1/2} \qquad (14.16)$$

or as

$$R_{n,l,m,\uparrow} = -\tilde{R}'_{n,l,m_j}\frac{1}{\sqrt{2l+1}}\sqrt{l - m_j + 1/2} \qquad (14.17)$$

$$R_{n,l,m+1,\downarrow} = \tilde{R}'_{n,l,m_j}\frac{1}{\sqrt{2l+1}}\sqrt{l + m_j + 1/2} \qquad (14.18)$$

I can rewrite Eq. 14.14 either as

$$|\chi_1\rangle = \tilde{R}_{n,l,m_j}\,|j_1, l, m_j\rangle$$

for $j_1 = l + 1/2$ or as

$$|\chi_2\rangle = \tilde{R}'_{n,l,m_j}\,|j_2, l, m_j\rangle$$

for $j_2 = l - 1/2$. Thereby I am explicitly demonstrating that with these choices of
radial functions, vectors $|\chi_1\rangle$ and $|\chi_2\rangle$ become eigenvectors of the operator $\mathbf{J}^2$.

If you are wondering why this fact is such a big deal, just take a new look at
Eq. 9.72 depicting the structure of the operator $\mathbf{J}^2$, and you will see that it contains
the same term $\mathbf{L}\cdot\mathbf{S}$ as the spin-orbit coupling contribution to the Hamiltonian. It does
not take much now to demonstrate that $\mathbf{J}^2$ commutes with the entire Hamiltonian,
including the spin-orbit coupling term, and, therefore, common eigenvectors of
both $\mathbf{J}^2$ and $J_z$ are also eigenvectors of the Hamiltonian. All that is left now is to
substitute Eqs. 14.15 and 14.16 or Eqs. 14.17 and 14.18 into Eqs. 14.10 and 14.11,
replacing along the way $m$ with $m_j - 1/2$ and $m + 1$ with $m_j + 1/2$. First I deal with
Eq. 14.10:


{ kr + y A ( Wy 2)) 7 ttt V/+my + 1/2 ^ + 

—X Jl(l + 1) - (m) - - j = yfl - mj + V2 Kimj = 


77 nJ 1 _ 

nM ‘ nlmj V 2 /TT 


yjl + iflj +1/2, 
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where I added indexes to the energy eigenvalue E anticipating quantum num¬ 
bers which this eigenvalue might depend upon. Recalling identity 9.87 to sim¬ 
plify yjl(l + 1) — (mj — I) = yj(l -b 1/2 + mj) (/ + 1/2 — ntj), I determine that 

y/ + m y +l/2/V2 TF 1 is a common factor which can be canceled and that the 
remaining expression takes the following form: 

(h, + yA - f) + yA (/ - + 1/2)) ^ lmj = E nJ ^ lmj =» 

(Hr + yA/) = E nJl ,K lm] . (14.19) 

This last equation surely looks nice and pretty, but before celebrating, we all would 
be well advised to make sure that Eq. 14.11 will be reduced to the same form after 
all these substitutions, because if it did not, I would have had two different equations 
for the same function, and this wouldn’t be too good, right? So, here I go: 

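$$\left[\hat{H}_r - \frac{\hbar^2\lambda}{2}\left(m_j + \frac{1}{2}\right)\right] R^{(1)}_{n,l,m_j}\frac{\sqrt{l-m_j+1/2}}{\sqrt{2l+1}} + \frac{\hbar^2\lambda}{2}\sqrt{l(l+1) - \left(m_j - \frac{1}{2}\right)\left(m_j + \frac{1}{2}\right)}\; R^{(1)}_{n,l,m_j}\frac{\sqrt{l+m_j+1/2}}{\sqrt{2l+1}} = E_{n,j,l}\, R^{(1)}_{n,l,m_j}\frac{\sqrt{l-m_j+1/2}}{\sqrt{2l+1}}.$$

Using the same trick as above, I will now find that $\sqrt{l - m_j + 1/2}/\sqrt{2l+1}$ is the common factor, and after canceling it, I will end up with equation

$$\left[\hat{H}_r - \frac{\hbar^2\lambda}{2}\left(m_j + \frac{1}{2}\right) + \frac{\hbar^2\lambda}{2}\left(l + m_j + \frac{1}{2}\right)\right] R^{(1)}_{n,l,m_j} = E_{n,j,l}\, R^{(1)}_{n,l,m_j} \;\Rightarrow\; \left(\hat{H}_r + \frac{\hbar^2\lambda}{2}\, l\right) R^{(1)}_{n,l,m_j} = E_{n,j,l}\, R^{(1)}_{n,l,m_j},$$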

which is exactly the same as Eq. 14.19. Now we can celebrate: Eq. 14.19 is, indeed, a correct equation for the radial wave function corresponding to $j_1 = l + 1/2$. Repeating all these steps with Eqs. 14.17 and 14.18, you can obtain the radial equation for $R^{(2)}_{n,l,m_j}$ in the form

$$\left[\hat{H}_r - \frac{\hbar^2\lambda}{2}\,(l + 1)\right] R^{(2)}_{n,l,m_j} = E_{n,j,l}\, R^{(2)}_{n,l,m_j}. \tag{14.20}$$

The spin-orbit contribution to Eqs. 14.19 and 14.20 is different for different values of $j$, which justifies adding this number as an index in the notation for energy $E_{n,j,l}$. Of course, none of this should come as a big surprise; it is just an additional verification of the fact that eigenvectors of $\hat{J}^2$ and $\hat{J}_z$ are, indeed, the eigenvectors of the Hamiltonian.
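The spin-orbit factors $l/2$ and $-(l+1)/2$ multiplying $\hbar^2\lambda$ in Eqs. 14.19 and 14.20 can also be cross-checked numerically. The sketch below is only an illustration: it assumes the standard decomposition $\hat{\mathbf{L}}\cdot\hat{\mathbf{S}} = \hat{L}_z\hat{S}_z + (\hat{L}_+\hat{S}_- + \hat{L}_-\hat{S}_+)/2$, builds the corresponding 2×2 block in the basis $\{|l,m\rangle|\uparrow\rangle,\ |l,m+1\rangle|\downarrow\rangle\}$ in units of $\hbar^2$, and diagonalizes it. The function names are hypothetical and not part of the text.

```python
import math


def ls_block(l, m):
    """2x2 matrix of L.S / hbar^2 in the basis {|l,m>|up>, |l,m+1>|down>}.

    Assumption: L.S = Lz*Sz + (L+ S- + L- S+)/2 with the standard
    ladder-operator matrix elements.
    """
    off = 0.5 * math.sqrt(l * (l + 1) - m * (m + 1))
    return [[0.5 * m, off], [off, -0.5 * (m + 1)]]


def eigenvalues_2x2(a):
    """Eigenvalues of a real symmetric 2x2 matrix, largest first."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = math.sqrt(tr * tr / 4.0 - det)
    return tr / 2.0 + disc, tr / 2.0 - disc


def check(l_max=4):
    """Confirm the eigenvalues are l/2 and -(l+1)/2 for every allowed m."""
    for l in range(1, l_max + 1):
        for m in range(-l, l):  # m + 1 must not exceed l
            hi, lo = eigenvalues_2x2(ls_block(l, m))
            assert abs(hi - l / 2.0) < 1e-12          # j = l + 1/2
            assert abs(lo + (l + 1) / 2.0) < 1e-12    # j = l - 1/2
    return True


if __name__ == "__main__":
    print(check())
```

The eigenvalues do not depend on $m$, which is another way of seeing that the energy is labeled by $j$ and $l$ only.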

Trigger warning: what I am going to show you now might get some of you 
really upset. All this work, which we did deriving Eqs. 14.19 and 14.20, was not 
really necessary as all these results can be derived faster and with less effort. So, if 
you feel the righteous indignation about me torturing you unnecessarily, I do have 
something to say in my defense. Now you have a much clearer understanding of 
the nuts and bolts of the spin-orbit interaction and its effect on the eigenvectors of 
the Hamiltonian than you would have had otherwise, and a few extra hours of me 
writing and you reading is a fair price to pay for it. 

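So, now let me show the easier way, and I will begin by writing down a complete Hamiltonian with the spin-orbit interaction in the form

$$\hat{H} = -\frac{\hbar^2}{2\mu r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{\hat{L}^2}{2\mu r^2} - \frac{Ze^2}{4\pi\varepsilon_0\varepsilon_r r} + \lambda\,\hat{\mathbf{L}}\cdot\hat{\mathbf{S}},$$

where the kinetic energy is presented as a sum of radial and orbital momentum terms, and I use the same notations as in Chap. 8. Next I replace $\hat{\mathbf{L}}\cdot\hat{\mathbf{S}}$ in the spin-orbit interaction with

$$\hat{\mathbf{L}}\cdot\hat{\mathbf{S}} = \frac{1}{2}\left(\hat{J}^2 - \hat{L}^2 - \hat{S}^2\right)$$

using Eq. 9.72, so that the Hamiltonian becomes

$$\hat{H} = -\frac{\hbar^2}{2\mu r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{\hat{L}^2}{2\mu r^2} - \frac{Ze^2}{4\pi\varepsilon_0\varepsilon_r r} + \frac{1}{2}\lambda\left(\hat{J}^2 - \hat{L}^2 - \hat{S}^2\right). \tag{14.21}$$

Now acting by Hamiltonian 14.21 on

$$|\chi\rangle = R_{j,l}\,|j,l,m_j\rangle, \tag{14.22}$$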


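I get

$$\left[-\frac{\hbar^2}{2\mu r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{\hat{L}^2}{2\mu r^2} - \frac{Ze^2}{4\pi\varepsilon_0\varepsilon_r r} + \frac{\lambda}{2}\left(\hat{J}^2 - \hat{L}^2 - \hat{S}^2\right)\right] R_{j,l}(r)\,|j,l,m_j\rangle =$$

$$\left[-\frac{\hbar^2}{2\mu r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{\hbar^2 l(l+1)}{2\mu r^2} - \frac{Ze^2}{4\pi\varepsilon_0\varepsilon_r r} + \frac{\lambda\hbar^2}{2}\left(j(j+1) - l(l+1) - \frac{3}{4}\right)\right] R_{j,l}(r)\,|j,l,m_j\rangle.$$

Respectively, the radial part of the Schrödinger equation becomes

$$\left[-\frac{\hbar^2}{2\mu r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{\hbar^2 l(l+1)}{2\mu r^2} - \frac{Ze^2}{4\pi\varepsilon_0\varepsilon_r r} + \frac{\lambda\hbar^2}{2}\left(j(j+1) - l(l+1) - \frac{3}{4}\right)\right] R_{j,l}(r) = E_{n,j,l}\,R_{j,l}(r). \tag{14.23}$$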


The first three terms in this equation correspond to $\hat{H}_r$ in Eqs. 14.19 and 14.20, and the last term is the spin-orbit contribution. Now, it is just a matter of simple algebra to convince oneself that $j(j+1) - l(l+1) - 3/4$ is equal to $l$ for $j = l + 1/2$ and to $-l - 1$ for $j = l - 1/2$, making Eq. 14.23 equivalent to Eqs. 14.19 and 14.20.
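The "simple algebra" mentioned above can be confirmed with exact rational arithmetic. The following is a minimal sketch; the helper names `spin_orbit_factor` and `verify` are illustrative and do not come from the text.

```python
from fractions import Fraction


def spin_orbit_factor(j, l):
    """Return j(j+1) - l(l+1) - 3/4 as an exact fraction."""
    return j * (j + 1) - l * (l + 1) - Fraction(3, 4)


def verify(l_max=10):
    """Check both values of j = l +/- 1/2 for l = 0 .. l_max."""
    half = Fraction(1, 2)
    for l in range(0, l_max + 1):
        # j = l + 1/2 gives l (including zero for l = 0)
        assert spin_orbit_factor(l + half, l) == l
        # j = l - 1/2 exists only for l > 0 and gives -(l + 1)
        if l > 0:
            assert spin_orbit_factor(l - half, l) == -(l + 1)
    return True


if __name__ == "__main__":
    print(verify())
```

Note that for $l = 0$, $j = 1/2$ the factor is exactly zero, so the spin-orbit term drops out of Eq. 14.23 for s-states.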


14.1.3 Fine Structure of the Hydrogen Spectrum 


To determine how spin-orbit interaction affects energy eigenvalues of the electron in the hydrogen atom, one would need to solve Eq. 14.23. Unfortunately, since the spin-orbit coupling parameter $\lambda$ depends on the radial coordinate as $1/r^3$, which is different from both the $1/r$ term in the Coulomb potential and the $1/r^2$ term in the orbital angular momentum term, the exact analytical solution of this equation is out of reach, so the perturbation treatment discussed in Chap. 13 needs to be invoked with the spin-orbit term being treated as the perturbation. The first-order correction to the hydrogen energy levels is given by the expectation value of the perturbation term calculated with unperturbed radial hydrogen wave functions $R^{(0)}_{nl}(r)$ described in Chap. 8:

$$E_{n,j,l} = E_n^{(0)} + \frac{Z\hbar^2 e^2}{16\pi\varepsilon_0\varepsilon_r\mu^2 c^2}\left[j(j+1) - l(l+1) - \frac{3}{4}\right]\left\langle\frac{1}{r^3}\right\rangle,$$


where $\lambda$ was replaced with its explicit expression from Eq. 14.6. Using Eq. 8.34 for the expectation value

$$\left\langle r^{-3}\right\rangle = \int_0^\infty \frac{1}{r^3}\left[R^{(0)}_{nl}(r)\right]^2 r^2\,dr = \frac{2Z^3}{a_B^3\, n^3\, l(l+1)(2l+1)},$$

which was derived in Sect. 8.3, I find the first-order spin-orbit corrections to the energy levels to be

$$E_{n,j,l} = E_n^{(0)} + \frac{Z^4\hbar^2 e^2}{8\pi\varepsilon_0\varepsilon_r\mu^2 c^2 a_B^3}\,\frac{j(j+1) - l(l+1) - \frac{3}{4}}{l(l+1)(2l+1)\,n^3} = E_n^{(0)}\left(1 - \frac{Z^2\hbar^2}{\mu^2 c^2 a_B^2}\,\frac{j(j+1) - l(l+1) - \frac{3}{4}}{l(l+1)(2l+1)\,n}\right). \tag{14.24}$$

Equation 8.34 for $\langle r^{-3}\rangle$ fails for $l = 0$, but it is not a big concern because the spin-orbit interaction vanishes at $l = 0$ anyway (spin has no one to interact with when the orbital momentum disappears; formally it becomes clear by substituting $l = 0$, $j = 1/2$ into Eq. 14.23). Accordingly, Eq. 14.24 should be applied only to the states with $l > 0$.

The dimensionless constant $Z\hbar/(\mu c a_B)$ in Eq. 14.24 can be presented in several equivalent forms. First, replacing the Bohr radius with its expression from Eq. 8.8, this constant can be rewritten as

$$\frac{Z\hbar}{\mu c a_B} = \frac{Ze^2}{4\pi\varepsilon_0\varepsilon_r c\hbar} = \frac{Z}{\varepsilon_r}\,\alpha,$$

where I combined all fundamental parameters into the so-called fine-structure constant ($\varepsilon_r$ and $Z$ are material parameters and are not fundamental)

$$\alpha = \frac{e^2}{4\pi\varepsilon_0 c\hbar} \approx \frac{1}{137.036}. \tag{14.25}$$


This is one of the most famous constants in physics, which plays a particularly important role in quantum electrodynamics, appearing as a dimensionless strength of interaction between electrons and photons. One of the possible interpretations of this constant involves the ratio of the potential energy of the system of two electrons at a distance $r$ from each other and the energy of a photon with wavelength $\lambda = 2\pi r$:

$$\alpha = \frac{e^2}{4\pi\varepsilon_0 r}\bigg/\frac{2\pi\hbar c}{\lambda} = \frac{e^2}{4\pi\varepsilon_0\hbar c}.$$
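As a quick numerical illustration of Eq. 14.25 and of the energy-ratio interpretation, the sketch below evaluates $\alpha$ from SI values of the fundamental constants. The numerical values are CODATA 2018 figures typed in by hand, and the function names are assumptions made for this example only.

```python
import math

# CODATA 2018 values in SI units (entered by hand for this illustration)
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
HBAR = 1.054571817e-34       # reduced Planck constant, J s
C = 299792458.0              # speed of light, m/s


def fine_structure_constant():
    """alpha = e^2 / (4 pi eps0 hbar c), Eq. 14.25."""
    return E_CHARGE**2 / (4.0 * math.pi * EPS0 * HBAR * C)


def energy_ratio(r):
    """Coulomb energy of two electrons at distance r divided by the
    energy of a photon with wavelength 2*pi*r."""
    coulomb = E_CHARGE**2 / (4.0 * math.pi * EPS0 * r)
    photon = 2.0 * math.pi * HBAR * C / (2.0 * math.pi * r)
    return coulomb / photon


if __name__ == "__main__":
    print(1.0 / fine_structure_constant())
    print(energy_ratio(1e-10) / fine_structure_constant())
```

The ratio returned by `energy_ratio` is independent of the chosen distance, which is what makes $\alpha$ a genuinely dimensionless measure of the interaction strength.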



Introducing this fine-structure constant, Eq. 14.24 can be written down as

$$E_{n,j,l} = E_n^{(0)}\left(1 - \frac{Z^2\alpha^2}{\varepsilon_r^2}\,\frac{j(j+1) - l(l+1) - \frac{3}{4}}{l(l+1)(2l+1)\,n}\right). \tag{14.26}$$

Alternatively, one can rewrite the same constant in terms of the zero-order hydrogen energies $E_n^{(0)}$:

$$\left(\frac{Z\hbar}{\mu c a_B}\right)^2 = \frac{Z^2 e^4}{16\pi^2\varepsilon_0^2\varepsilon_r^2 c^2\hbar^2} = \frac{2\left|E_n^{(0)}\right| n^2}{\mu c^2},$$

where I used the expression for $E_n^{(0)}$ from Eq. 8.16. Now the spin-orbit correction to the energy $\Delta E^{(so)}_{n,j,l}$ can be written down as

$$\Delta E^{(so)}_{n,j,l} = \frac{2\left[E_n^{(0)}\right]^2}{\mu c^2}\, n\,\frac{j(j+1) - l(l+1) - \frac{3}{4}}{l(l+1)(2l+1)}. \tag{14.27}$$

Unlike Eq. 14.26, the just-derived expression defines the spin-orbit correction to the energy in terms of the ratio of the unperturbed energy and the electron's rest energy $\mu c^2$. Both expressions for the energy predict that the spin-orbit interaction lifts the degeneracy of the hydrogen energy levels with respect to the orbital quantum number $l$, and, in addition, each energy level of hydrogen with $l \neq 0$ splits in two levels with different values of the total angular momentum $j_{1,2} = l \pm 1/2$.

At this point a question should pop up in the head of an attentive reader: why is it OK to derive Eq. 14.24 using the non-degenerate perturbation theory when everyone knows that all hydrogen energy levels (except for the ground state, of course) are degenerate? This is a valid question, but the reason why the non-degenerate perturbation theory works here is quite simple: the perturbation matrix $\langle n,j,l,m_j|\,\lambda\hat{\mathbf{L}}\cdot\hat{\mathbf{S}}\,|n,j_1,l_1,m_{j_1}\rangle$ is diagonal in the basis of degenerate eigenvectors of the unperturbed Hamiltonian formed by all vectors $|n,j,l,m_j\rangle$ with a fixed value of the principal quantum number $n$. In other words, by choosing the eigenvectors of the operators $\hat{L}^2$, $\hat{J}^2$, and $\hat{J}_z$ as the zero-order eigenvectors, I automatically chose the “correct” basis diagonalizing the perturbation operator.

While we can be rightfully proud of being able to derive Eq. 14.26, it is a bit 
too early to pop champagne corks. Unfortunately, this result does not describe the 
experimental data correctly signaling that something in our analysis is missing. To 
figure out what it might be, note the presence of the speed of light in the fine- 
structure constant which hints that the spin-orbit interaction might have something 
to do with relativistic effects. After all, in the extreme nonrelativistic limit $c \to \infty$,
the spin-orbit correction does vanish. Then the question you should be asking is
if there are other relativistic effects which we might have missed and which can
affect the hydrogen spectrum. The answer to this question is, indeed, affirmative:
there are two more relativistic effects which generate contributions to the energy of
the same order of magnitude as the spin-orbit interaction and, therefore, need to be
included into consideration. One of these effects is due to the relativistic correction
to the electron’s kinetic energy, and the other is the so-called Darwin³ term, whose
origin is not that easy to explain without invoking a complete relativistic theory 
of electrons based on Dirac’s equation. However, this term contributes only to the 
energy of l = 0 states, and as luck would have it, it is reproduced correctly by 
a simple sum of the spin-orbit and relativistic kinetic energy contributions to the 
energy. 

The kinetic energy of a relativistic particle, according to Einstein’s theory of 
special relativity, is related to the particle momentum p as 


$$K_{rel} = \sqrt{p^2c^2 + m_e^2c^4} = m_ec^2\sqrt{1 + \frac{p^2}{m_e^2c^2}}.$$


3 Charles Galton Darwin was an English physicist, a grandson of Charles Darwin, the author of the theory of evolution. He became the director of the National Physical Laboratory in 1938 and remained in this position through World War II, participating in the Manhattan Project, where he was responsible for coordinating American, British, and Canadian efforts.

Since we are interested in small relativistic corrections only, I can expand this expression in a power series with respect to $(p/m_ec)^2$ and keep just the first three terms in the series

$$K_{rel} \approx m_ec^2 + \frac{p^2}{2m_e} - \frac{p^4}{8m_e^3c^2}. \tag{14.28}$$


The first and the second terms in this equation are the rest energy of the electron 
(which is just a constant and can be ignored) and the nonrelativistic kinetic energy. 
The object of our interest is the third term in this expression whose effect on the 
hydrogen energy levels I intend to explore. 
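To see how good the three-term truncation in Eq. 14.28 is, one can compare it numerically with the exact square root. The sketch below works in units of $m_ec^2$ with the dimensionless variable $x = p/(m_ec)$; the function names are illustrative only.

```python
import math


def exact_energy(x):
    """sqrt(p^2 c^2 + m^2 c^4) in units of m c^2, with x = p / (m c)."""
    return math.sqrt(1.0 + x * x)


def three_term_expansion(x):
    """Rest energy + p^2/(2m) - p^4/(8 m^3 c^2), in units of m c^2."""
    return 1.0 + x**2 / 2.0 - x**4 / 8.0


def truncation_error(x):
    """Absolute difference between the exact and truncated expressions."""
    return abs(exact_energy(x) - three_term_expansion(x))


if __name__ == "__main__":
    for x in (0.3, 0.1, 0.03):
        # The leading neglected term of the series is x^6 / 16.
        print(x, truncation_error(x), x**6 / 16.0)
```

The error shrinks as $x^6$, so for an electron in hydrogen, where $p/(m_ec)$ is of the order of the fine-structure constant, keeping the $p^4$ term alone is an excellent approximation.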

I cannot use the radial equation 14.23 for this purpose because it does not include the $p^4$ term, and if I try to separate it into the radial and angular parts, the result would be disastrously cumbersome (just imagine having to square the first two terms in Eq. 14.21). Fortunately, I do not have to do it and can work instead with the Hamiltonian

$$\hat{H} = \frac{\hat{p}^2}{2\mu} - \frac{\hat{p}^4}{8\mu^3c^2} - \frac{Ze^2}{4\pi\varepsilon_0\varepsilon_r r} + \lambda\,\hat{\mathbf{L}}\cdot\hat{\mathbf{S}}, \tag{14.29}$$


where the kinetic energy operator remains intact, and in the relativistic correction term, I replaced electron mass $m_e$ with the reduced mass $\mu$ to preserve the consistency of the notation. The first thing which is important to realize is that vectors $|j,l,m_j\rangle$ are eigenvectors of the operator $\hat{p}^4$, which is obvious because an eigenvector of any operator $\hat{A}$ is automatically an eigenvector of an operator $\hat{A}^2$, and⁴ we have already seen that $|j,l,m_j\rangle$ is an eigenvector of $\hat{p}^2$. This means that the perturbation matrix built on the basis of these vectors is diagonal, and I can again use the non-degenerate perturbation theory to find the first-order corrections to the energy. The second important point that needs to be made is that the first order of the perturbation theory is linear with respect to the perturbation operator. As a result the correction to the energy due to the sum of two perturbation operators is equal to the sum of the corresponding energy corrections due to each of the perturbations separately. Accordingly, the modification of the hydrogen energy levels due to the relativistic term in Hamiltonian 14.29 can be written down as

$$\Delta E^{(rel)}_{n,j,l} = -\frac{1}{8\mu^3c^2}\,\langle n,j,l,m_j|\,\hat{p}^4\,|n,j,l,m_j\rangle, \tag{14.30}$$


4 If it is not obvious to you, here is the proof: assume that $\hat{A}|q\rangle = a_q|q\rangle$. Then $\hat{A}\hat{A}|q\rangle = a_q\hat{A}|q\rangle = a_q^2|q\rangle$.

where $|n,j,l,m_j\rangle = R^{(0)}_{nl}\,|j,l,m_j\rangle$ are complete zero-order hydrogen wave functions written in the basis of the vectors $|j,l,m_j\rangle$. In order to compute the matrix element in Eq. 14.30, I will present the operator $\hat{p}^4$ in it as

$$\Delta E^{(rel)}_{n,j,l} = -\frac{1}{2\mu c^2}\,\langle n,j,l,m_j|\,\frac{\hat{p}^2}{2\mu}\,\frac{\hat{p}^2}{2\mu}\,|n,j,l,m_j\rangle \tag{14.31}$$


and will consider the action of the emerging kinetic energy operators on their respective ket (on the right) and bra (on the left) vectors. The expression

$$\frac{\hat{p}^2}{2\mu}\,|n,j,l,m_j\rangle$$

is evaluated by converting the time-independent Schrödinger equation

$$\left(\frac{\hat{p}^2}{2\mu} - \frac{Ze^2}{4\pi\varepsilon_0\varepsilon_r r}\right)|n,j,l,m_j\rangle = E_n^{(0)}\,|n,j,l,m_j\rangle$$

into

$$\frac{\hat{p}^2}{2\mu}\,|n,j,l,m_j\rangle = \left(E_n^{(0)} + \frac{Ze^2}{4\pi\varepsilon_0\varepsilon_r r}\right)|n,j,l,m_j\rangle.$$

Hermitian conjugation of this result yields

$$\langle n,j,l,m_j|\,\frac{\hat{p}^2}{2\mu} = \left(E_n^{(0)} + \frac{Ze^2}{4\pi\varepsilon_0\varepsilon_r r}\right)\langle n,j,l,m_j|.$$

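Substituting these expressions into Eq. 14.31, I get

$$\begin{aligned}
\Delta E^{(rel)}_{n,j,l} &= -\frac{1}{2\mu c^2}\,\langle n,j,l,m_j|\left(E_n^{(0)} + \frac{Ze^2}{4\pi\varepsilon_0\varepsilon_r r}\right)^2|n,j,l,m_j\rangle \\
&= -\frac{\left[E_n^{(0)}\right]^2}{2\mu c^2} - \frac{E_n^{(0)}}{\mu c^2}\,\frac{Ze^2}{4\pi\varepsilon_0\varepsilon_r}\,\langle n,j,l,m_j|\frac{1}{r}|n,j,l,m_j\rangle - \frac{1}{2\mu c^2}\,\frac{Z^2e^4}{16\pi^2\varepsilon_0^2\varepsilon_r^2}\,\langle n,j,l,m_j|\frac{1}{r^2}|n,j,l,m_j\rangle.
\end{aligned}$$

Both expectation values $\langle r^{-1}\rangle$ and $\langle r^{-2}\rangle$ appearing in this expression were computed in Sect. 8.3, and substitution of the corresponding expressions from Eqs. 8.29 and 8.33 yields

$$\begin{aligned}
\Delta E^{(rel)}_{n,j,l} &= -\frac{\left[E_n^{(0)}\right]^2}{2\mu c^2} - \frac{E_n^{(0)}}{\mu c^2}\,\frac{Ze^2}{4\pi\varepsilon_0\varepsilon_r}\,\frac{Z}{a_Bn^2} - \frac{1}{2\mu c^2}\,\frac{Z^2e^4}{16\pi^2\varepsilon_0^2\varepsilon_r^2}\,\frac{2Z^2}{a_B^2(2l+1)n^3} \\
&= -\frac{\left[E_n^{(0)}\right]^2}{2\mu c^2}\left(1 + \frac{Z^2e^2}{2\pi\varepsilon_0\varepsilon_ra_Bn^2E_n^{(0)}} + \frac{Z^4e^4}{8\pi^2\varepsilon_0^2\varepsilon_r^2a_B^2\left[E_n^{(0)}\right]^2}\,\frac{1}{(2l+1)n^3}\right) \\
&= -\frac{\left[E_n^{(0)}\right]^2}{2\mu c^2}\left(\frac{8n}{2l+1} - 3\right) = \alpha^2\,\frac{Z^2}{\varepsilon_r^2}\,\frac{E_n^{(0)}}{4n^2}\left(\frac{8n}{2l+1} - 3\right),
\end{aligned} \tag{14.32}$$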


where I simplified the resulting expressions using Eqs. 8.8 and 8.17 for the Bohr radius and unperturbed energy of the hydrogen atom, respectively. Combining Eqs. 14.27 and 14.32, I obtain the total first-order correction to the energy levels of an electron in the hydrogen atom:

$$\Delta E_{n,j,l} = -\frac{\left[E_n^{(0)}\right]^2}{\mu c^2}\left[\frac{2n}{2l+1}\left(2 - \frac{j(j+1) - l(l+1) - \frac{3}{4}}{l(l+1)}\right) - \frac{3}{2}\right].$$

Substituting $j = l + 1/2$ or $j = l - 1/2$, you can convince yourself that this expression can be simplified into

$$\Delta E^{(fs)}_{n,j} = -\frac{\left[E_n^{(0)}\right]^2}{\mu c^2}\left(\frac{2n}{j + \frac{1}{2}} - \frac{3}{2}\right) \tag{14.33}$$


for any value of $j$. Surprisingly, even for $l = 0$ ($j = 1/2$), this expression gives the correct answer including the abovementioned Darwin term, which I did not even bother to consider. According to this result, energy levels of the electron in a hydrogen-like atom acquire dependence on the total angular momentum via the quantum number $j$ but remain independent of the orbital quantum number $l$. For instance, this result predicts that states of the electron with $j = 1/2$ originating from orbitals with $l = 0$ and $l = 1$ would have the same energy, while states with $j = 1/2$, $l = 1$ and $j = 3/2$, $l = 1$ would have distinct energies. This difference between energy levels characterized by the same principal number $n$ and different values of $j$ is called the fine structure of the hydrogen atom.
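The statement that the spin-orbit and relativistic corrections combine into an expression depending on $j$ alone can be verified with exact fractions. The sketch below assumes the forms of Eqs. 14.27, 14.32, and 14.33 expressed in units of $[E_n^{(0)}]^2/(\mu c^2)$; the function names are hypothetical and chosen for this illustration.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def spin_orbit(n, l, j):
    """Eq. 14.27 in units of [E_n^(0)]^2 / (mu c^2); valid for l > 0."""
    bracket = j * (j + 1) - l * (l + 1) - Fraction(3, 4)
    return 2 * n * bracket / (l * (l + 1) * (2 * l + 1))


def relativistic(n, l):
    """Eq. 14.32 in the same units."""
    return -HALF * (Fraction(8 * n, 2 * l + 1) - 3)


def fine_structure(n, j):
    """Eq. 14.33 in the same units."""
    return -(2 * n / (j + HALF) - Fraction(3, 2))


def verify(n_max=6):
    """Check that 14.27 + 14.32 equals 14.33 for all l > 0 and both j."""
    for n in range(2, n_max + 1):
        for l in range(1, n):
            for j in (l + HALF, l - HALF):
                assert spin_orbit(n, l, j) + relativistic(n, l) == fine_structure(n, j)
    return True


if __name__ == "__main__":
    print(verify())
    print(fine_structure(2, Fraction(3, 2)), fine_structure(2, HALF))
```

For $n = 2$ the two values are $-1/2$ and $-5/2$ in these units, so the $j = 3/2$ level lies above the $j = 1/2$ level by twice $[E_2^{(0)}]^2/(\mu c^2)$.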

While agreement between theoretical results predicted by Eq. 14.33 and the experiment is relatively good, more careful measurements reveal additional energy levels of hydrogen not predicted by this equation. The origin of these even more closely spaced energies, called hyperfine structure, can be traced to the interaction between the spins of the electron and the nucleus. A consideration of this effect is outside of the scope of this book. Another interesting point related to the spectrum of hydrogen concerns the abovementioned prediction that both the $j = 1/2$, $l = 0$ and the $j = 1/2$, $l = 1$ states have the same energy. In reality the latter state was found to have a slightly lower energy than the former. The difference between these energies is called the Lamb shift, and its discovery by Willis Lamb⁵ and his graduate student Robert Retherford in 1947 was a triumph of experimental physics, for which Lamb was awarded the Nobel Prize in Physics in 1955. The Lamb shift is a stunning example of a phenomenon which combines a small magnitude with outsized significance for physics: the explanation of its origin gave birth to modern quantum electrodynamics with its far-reaching concept of renormalization, influencing disparate fields from phase transitions to black holes. The calculations of the Lamb shift were first carried out by Hans Bethe in 1947,⁶ whose paper on the Lamb shift was only two pages long and became one of the most influential theoretical papers of the post-World War II period. Alas, you will have to wait till a more serious graduate level course in quantum electrodynamics to be able to appreciate the beauty of the theory explaining the Lamb shift.


14.2 Zeeman Effect 

Changes in atomic spectra in the presence of a strong magnetic field were first 
observed in 1896 by Dutch physicist Pieter Zeeman, whose experiments were based 
on earlier theoretical work of Zeeman’s compatriot Hendrik Antoon Lorentz (who 
also derived the famous Lorentz transformations used by Einstein in his relativity 
theory). Both Zeeman and Lorentz were awarded in 1902 the Nobel Prize in Physics 


5 Willis Lamb was an American experimental physicist who made significant contributions to quantum electrodynamics and the field of quantum measurements. He held professorial positions at the University of Oxford and Yale, Columbia and Stanford Universities, and the University of Arizona.

6 Hans Bethe was a German-born physicist who immigrated to the USA in 1935 (only 2 years later than Einstein) and became a professor at Cornell University where he worked till his death in 2005. He won the 1967 Nobel Prize in Physics for his work on the theory explaining the formation of chemical elements due to nuclear reactions within stars. During the war, he headed theoretical efforts within the Manhattan Project and played a critical role in calculating the critical mass of the weapons. After the war he was active in efforts to outlaw testing of nuclear weapons, convincing the Kennedy and Nixon administrations to sign the Partial Nuclear Test Ban Treaty (1963) and the Anti-Ballistic Missile Treaty (1972). He also made important contributions to solid-state physics.

for this discovery, signifying its importance. While initially seen as a broadening of a spectral line, it was later found to be a splitting of an initial line into as many as 15 additional lines. This phenomenon is rightfully known as the Zeeman effect, and I have already mentioned it in Chap. 9 trying to provide experimental justifications for an electron’s spin. The Zeeman effect proved to play an important role in many areas of physics and astrophysics, in addition to quantum mechanics. It is not surprising, therefore, that I am, like many other authors of quantum mechanics textbooks, eager to devote a significant amount of time to developing a proper quantum description of this phenomenon. Since the spin-orbit interaction plays an important role in this treatment, now seems like a suitable time to do so.

When describing the contribution of the interaction with a magnetic field to the atom’s Hamiltonian, one needs to take into account that an electron in a generic quantum state possesses two types of magnetic dipole moments: one, $\boldsymbol{\mu}_L$, is related to its orbital angular momentum $\mathbf{L}$, and the other, $\boldsymbol{\mu}_s$, is due to the electron’s spin. While the corresponding contributions to the Hamiltonian are similar for both magnetic moments

$$\hat{H}_Z = -\left(\hat{\boldsymbol{\mu}}_L + \hat{\boldsymbol{\mu}}_s\right)\cdot\mathbf{B},$$

the theoretical description of the effect is complicated by the fact that gyromagnetic ratios, connecting the magnetic moment with the corresponding angular momentum, are different for the orbital angular momentum and spin. As it was explained in Sect. 9.1, the latter is two times larger than the former (see Eqs. 9.1 and 9.7) so that the expression for the magnetic energy contribution to the Hamiltonian can be written down as

$$\hat{H}_Z = \frac{eB}{2m_e}\left(\hat{L}_z + 2\hat{S}_z\right), \tag{14.34}$$


where I choose the axis $Z$ of my coordinate system along the magnetic field $\mathbf{B}$. If the gyromagnetic ratios for orbital and spin momenta were the same, the magnetic energy would depend only on the z-component of the total angular momentum $\hat{J}_z$ so that the Zeeman energy and the entire Hamiltonian would commute with the operators $\hat{J}^2$ and $\hat{J}_z$, making the analysis as simple as that for the spin-orbit interaction. Unfortunately (or fortunately, since it makes life more interesting), this is not the case, and either the orbital or the spin momentum appears in the Zeeman energy separately. For instance, you can replace $\hat{L}_z$ with $\hat{J}_z - \hat{S}_z$ and end up with the following expression:

$$\hat{H}_Z = \frac{eB}{2m_e}\left(\hat{J}_z + \hat{S}_z\right). \tag{14.35}$$


Neither vectors $|l,m\rangle|m_s\rangle$ nor $|j,l,m_j\rangle$ are eigenvectors of the total Hamiltonian, which now contains both the spin-orbit term and $\hat{H}_Z$, and, therefore, I cannot simply invoke the first-order non-degenerate perturbation theory to find the corresponding corrections to the energy levels.

The situation can be simplified in two extreme limits: of very weak or very strong magnetic field. “Weakness” or “strength” of the field is meant here in comparison with the spin-orbit interaction. In the weak magnetic field, the fine structure of the spectrum due to the spin-orbit and relativistic corrections remains a predominant feature, while effects due to the magnetic field can be considered as a small perturbation. The strong magnetic field means that the Zeeman splitting, which we discussed in Sect. 9.1, is the main effect, while the spin-orbit and relativistic contributions play the role of a small perturbation. I will first consider the weak field limit.
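To get a feeling for the field scale that separates the two limits, one can compare the $n = 2$ fine-structure splitting following from Eq. 14.33 with the Zeeman energy scale $\mu_BB$. The sketch below is only an order-of-magnitude estimate: the numerical constants are entered by hand, the electron mass is used in place of the reduced mass, and the function names are assumptions introduced for this example.

```python
RYDBERG_EV = 13.605693                  # hydrogen ground-state binding energy, eV
ELECTRON_REST_EV = 510998.95            # m_e c^2 in eV (used instead of mu c^2)
BOHR_MAGNETON_EV_PER_T = 5.7883818e-5   # Bohr magneton, eV/T


def fine_structure_splitting(n, j_low, j_high):
    """Energy gap (eV) between levels j_high and j_low of shell n, from
    Delta E = -[E_n]^2/(mu c^2) * (2n/(j + 1/2) - 3/2)."""
    unit = (RYDBERG_EV / n**2) ** 2 / ELECTRON_REST_EV
    return 2 * n * unit * (1.0 / (j_low + 0.5) - 1.0 / (j_high + 0.5))


def crossover_field(n, j_low, j_high):
    """Field (tesla) at which mu_B * B equals the fine-structure gap."""
    return fine_structure_splitting(n, j_low, j_high) / BOHR_MAGNETON_EV_PER_T


if __name__ == "__main__":
    print(fine_structure_splitting(2, 0.5, 1.5))  # gap in eV
    print(crossover_field(2, 0.5, 1.5))           # field in tesla
```

With these numbers the crossover for $n = 2$ comes out somewhat below one tesla, so "weak" here means fields much smaller than that and "strong" means fields much larger.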


14.2.1 Zeeman Effect in the Weak Magnetic Field 

The weakness of the magnetic field means that the Hamiltonian of Eq. 14.35 can be treated as a perturbation term, while the atomic Hamiltonian with relativistic and spin-orbit contributions defines the system of zero-order energy eigenvalues and eigenvectors. This point needs to be emphasized here because the zero-order eigenvalues determine the degeneracy of the unperturbed spectrum and, therefore, determine the type of the perturbation theory one has to use. As you learned in Sect. 13.2, the most important quantity in this regard is the perturbation matrix built on the basis of the degenerate eigenvectors. If the fine structure defines the zero-order eigenvalues given by Eq. 14.33, then I have to deal with the subspace of vectors $|n,j,l,m_j\rangle$ with fixed numbers $n$ and $j$. The first term in Hamiltonian 14.35 is obviously diagonal in the basis of these states:

$$\langle n,j,l,m_j|\,\hat{J}_z\,|n,j,l',m_j'\rangle = \hbar m_j\,\delta_{ll'}\,\delta_{m_jm_j'}. \tag{14.36}$$

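Thus, the main attention must be paid to the second term. In order to compute the matrix element $\langle n,j,l,m_j|\hat{S}_z|n,j,l',m_j'\rangle$, I will invoke Eqs. 9.93 and 9.94 expressing vectors $|n,j,l,m_j\rangle$ as linear combinations of vectors $|n,l,m\rangle|m_s\rangle$ (I added the radial index $n$ to $|n,l,m\rangle$ in order to include the radial function, but it does not make any difference since none of the perturbation operators depends on the radial coordinate). Then for $j = l + 1/2$, I have

$$\begin{aligned}
\hat{S}_z|n,j,l',m_j'\rangle &= \frac{\sqrt{l'+m_j'+1/2}}{\sqrt{2l'+1}}\left|n,l',m_j'-\tfrac{1}{2}\right\rangle\hat{S}_z|\uparrow\rangle + \frac{\sqrt{l'-m_j'+1/2}}{\sqrt{2l'+1}}\left|n,l',m_j'+\tfrac{1}{2}\right\rangle\hat{S}_z|\downarrow\rangle \\
&= \frac{\hbar}{2}\,\frac{\sqrt{l'+m_j'+1/2}}{\sqrt{2l'+1}}\left|n,l',m_j'-\tfrac{1}{2}\right\rangle|\uparrow\rangle - \frac{\hbar}{2}\,\frac{\sqrt{l'-m_j'+1/2}}{\sqrt{2l'+1}}\left|n,l',m_j'+\tfrac{1}{2}\right\rangle|\downarrow\rangle
\end{aligned}$$

and

$$\langle n,j,l,m_j|\,\hat{S}_z\,|n,j,l',m_j'\rangle = \frac{\hbar}{2}\,\frac{1}{2l+1}\left[l + m_j + 1/2 - l + m_j - 1/2\right]\delta_{ll'}\,\delta_{m_jm_j'} = \frac{\hbar m_j}{2l+1}\,\delta_{ll'}\,\delta_{m_jm_j'},$$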


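where I took into account that spinors describing spin-up and spin-down states are orthogonal and so are orbital states $|n,l,m\rangle$ with different values of $m$ and $l$. Similarly for $j = l - 1/2$,

$$\begin{aligned}
\hat{S}_z|n,j,l',m_j'\rangle &= \frac{\sqrt{l'-m_j'+1/2}}{\sqrt{2l'+1}}\left|n,l',m_j'-\tfrac{1}{2}\right\rangle\hat{S}_z|\uparrow\rangle - \frac{\sqrt{l'+m_j'+1/2}}{\sqrt{2l'+1}}\left|n,l',m_j'+\tfrac{1}{2}\right\rangle\hat{S}_z|\downarrow\rangle \\
&= \frac{\hbar}{2}\,\frac{\sqrt{l'-m_j'+1/2}}{\sqrt{2l'+1}}\left|n,l',m_j'-\tfrac{1}{2}\right\rangle|\uparrow\rangle + \frac{\hbar}{2}\,\frac{\sqrt{l'+m_j'+1/2}}{\sqrt{2l'+1}}\left|n,l',m_j'+\tfrac{1}{2}\right\rangle|\downarrow\rangle,
\end{aligned}$$

$$\langle n,j,l,m_j|\,\hat{S}_z\,|n,j,l',m_j'\rangle = \frac{\hbar}{2}\,\frac{1}{2l+1}\left[l - m_j + 1/2 - l - m_j - 1/2\right]\delta_{ll'}\,\delta_{m_jm_j'} = -\frac{\hbar m_j}{2l+1}\,\delta_{ll'}\,\delta_{m_jm_j'}.$$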


So, what I see here is that even though vectors $|n,j,l,m_j\rangle$ are not eigenvectors of $\hat{S}_z$, the matrix $\langle n,j,l,m_j|\hat{S}_z|n,j,l',m_j'\rangle$ is still diagonal with respect to indexes $l$ and $m_j$. The significance of this fact is that now I do not need to diagonalize the perturbation matrix and can simply use the non-degenerate perturbation theory to find the magnetic field corrections to the fine-structure spectrum. The respective energy correction $\Delta E_{n,j,l,m_j}$ is given by

$$\Delta E_{n,j,l,m_j} = \frac{eB}{2m_e}\left(\hbar m_j \pm \frac{\hbar m_j}{2l+1}\right) = \frac{eB\hbar}{2m_e}\,m_j\left(1 \pm \frac{1}{2l+1}\right), \tag{14.37}$$

where the plus sign corresponds to $j = l + 1/2$ and the minus sign refers to $j = l - 1/2$. Both expressions appearing in Eq. 14.37 can be rewritten as a single formula with the arbitrary quantum number $j$ as

$$\Delta E_{n,j,l,m_j} = \frac{eB}{2m_e}\,\hbar m_j\left[1 + \frac{j(j+1) - l(l+1) + 3/4}{2j(j+1)}\right]. \tag{14.38}$$

If you are wondering how I managed to derive Eq. 14.38 from Eq. 14.37, I have to admit that in reality, I did nothing of the sort. To derive Eq. 14.38, one would have to use more general and sophisticated methods, which would be out of place in this book, but you can easily convince yourself that Eqs. 14.38 and 14.37 are indeed equivalent to each other. It should be noted that the magnetic field removes degeneracy of the energy levels forming the fine structure of the hydrogen spectrum not only with respect to the magnetic quantum number $m_j$ but also with respect to the orbital index $l$. The expression in the square brackets in Eq. 14.38 is often called the Landé g-factor, $g_{j,l}$:

$$g_{j,l} = 1 + \frac{j(j+1) - l(l+1) + 3/4}{2j(j+1)}. \tag{14.39}$$
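The equivalence of Eqs. 14.37 and 14.38 is easy to confirm with exact arithmetic. The sketch below computes $\langle\hat{S}_z\rangle/\hbar$ from the squared Clebsch-Gordan coefficients of the spin-up and spin-down components (assumed here to be $(l \pm m_j + 1/2)/(2l+1)$, as in the expansions above) and compares $m_j + \langle\hat{S}_z\rangle/\hbar$ with $g_{j,l}m_j$. The function names are illustrative only.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def sz_expectation(l, j, mj):
    """<S_z>/hbar in |j, l, mj> from squared Clebsch-Gordan weights."""
    if j == l + HALF:
        up = (l + mj + HALF) / (2 * l + 1)   # weight of the spin-up part
    else:
        up = (l - mj + HALF) / (2 * l + 1)
    down = 1 - up                            # weights sum to one
    return HALF * (up - down)


def lande_g(j, l):
    """Lande g-factor for spin 1/2, Eq. 14.39."""
    return 1 + (j * (j + 1) - l * (l + 1) + Fraction(3, 4)) / (2 * j * (j + 1))


def verify(l_max=4):
    """Check m_j + <S_z>/hbar == g * m_j for every allowed (l, j, m_j)."""
    for l in range(0, l_max + 1):
        for j in (l + HALF, l - HALF):
            if j < 0:
                continue  # j = l - 1/2 does not exist for l = 0
            mj = -j
            while mj <= j:
                assert mj + sz_expectation(l, j, mj) == lande_g(j, l) * mj
                mj += 1
    return True


if __name__ == "__main__":
    print(verify())
    print(lande_g(HALF, 0), lande_g(HALF, 1), lande_g(Fraction(3, 2), 1))
```

For the three $n = 2$ multiplets the g-factors come out as 2, 2/3, and 4/3, which shows explicitly how the magnetic field distinguishes states with the same $j$ but different $l$.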


Equation 14.38 must be combined with Eq. 14.33 to get the full expression for the hydrogen spectrum in the presence of a weak magnetic field:

$$E_{n,j,l,m_j} = E_n^{(0)} - \frac{\left[E_n^{(0)}\right]^2}{\mu c^2}\left(\frac{2n}{j + \frac{1}{2}} - \frac{3}{2}\right) + g_{j,l}\,\mu_BB\,m_j,$$

where I reintroduced the Bohr magneton $\mu_B$ defined in Eq. 9.5. The magnetic field contribution presented in such a form looks very much like the naive formula, Eq. 9.6, for the orbital Zeeman effect derived in Sect. 9.1. You can see that the entire effect of the spin and spin-orbit coupling in this limit is reduced to the Landé g-factor, $g_{j,l}$.


14.2.2 Strong Magnetic Field 

In the limit of the strong magnetic field, the Zeeman contribution to the atomic Hamiltonian is more important than the spin-orbit and relativistic corrections and must be included, therefore, into the zero-order unperturbed Hamiltonian. The resulting Hamiltonian is similar to the one considered in Sect. 9.1; the only difference is the presence of the spin contribution ignored in Eq. 9.3. However, the presence of the spin term does not change the main property of this Hamiltonian: it still commutes with operators $\hat{L}^2$, $\hat{L}_z$, and $\hat{S}_z$. Accordingly, vectors $|n,l,m\rangle|m_s\rangle$ are still eigenvectors of the Hamiltonian with the Zeeman term, and the respective eigenvalues are, similarly to Eq. 9.4:

$$\left[\hat{H}_{at} + \frac{eB}{2m_e}\left(\hat{L}_z + 2\hat{S}_z\right)\right]|n,l,m\rangle|m_s\rangle = \left[E_n^{(0)} + \mu_BB\,(m + 2m_s)\right]|n,l,m\rangle|m_s\rangle.$$


The Zeeman contribution lifts the degeneracy of the hydrogen energy eigenvalues with respect to the magnetic and spin quantum numbers $m$ and $m_s$, but the energy still does not depend on the orbital number $l$. The subspace of the degenerate eigenvectors is now formed by vectors $|n,l,m\rangle|m_s\rangle$, where all quantum numbers, except for $l$, are fixed. The spin-orbit coupling and relativistic correction are now being treated as a perturbation. The latter, which is proportional to $\hat{p}^4$, is obviously diagonal in this basis because vectors $|l,m\rangle$ (note the absence of $n$ in the latter, signifying that the radial function is not included at this point) are the eigenvectors of this operator. Accordingly, the contribution of this term to the energy is still given by Eq. 14.32 as in the absence of the magnetic field.

To deal with the spin-orbit term, I shall consider the matrix

$$\begin{aligned}
\langle m_s|\langle n,l,m|\left(\hat{L}_x\hat{S}_x + \hat{L}_y\hat{S}_y + \hat{L}_z\hat{S}_z\right)|n,l',m\rangle|m_s\rangle &= \langle n,l,m|\hat{L}_x|n,l',m\rangle\langle m_s|\hat{S}_x|m_s\rangle + \langle n,l,m|\hat{L}_y|n,l',m\rangle\langle m_s|\hat{S}_y|m_s\rangle \\
&\quad + \langle n,l,m|\hat{L}_z|n,l',m\rangle\langle m_s|\hat{S}_z|m_s\rangle.
\end{aligned}$$

Since expectation values of operators $\hat{S}_{x,y}$ in states presented by eigenvectors of $\hat{S}_z$, which appear in the expression above, are zeroes, this expression is reduced to

$$\langle m_s|\langle n,l,m|\left(\hat{L}_x\hat{S}_x + \hat{L}_y\hat{S}_y + \hat{L}_z\hat{S}_z\right)|n,l',m\rangle|m_s\rangle = \langle n,l,m|\hat{L}_z|n,l',m\rangle\langle m_s|\hat{S}_z|m_s\rangle$$

regardless of the value of the orbital number $l$. The remaining expression is easy to evaluate if you remember that, again regardless of $l$, $|n,l,m\rangle$ is an eigenvector of $\hat{L}_z$:

$$\langle m_s|\langle n,l,m|\,\hat{\mathbf{L}}\cdot\hat{\mathbf{S}}\,|n,l',m\rangle|m_s\rangle = \hbar^2mm_s\,\delta_{ll'}. \tag{14.40}$$

Equation 14.40 demonstrates that the perturbation matrix defined on the degenerate subspace of vectors $|n,l,m\rangle|m_s\rangle$ is diagonal, permitting me to use the non-degenerate perturbation theory to find the spin-orbit corrections to the energy in the first order of the perturbation theory. It yields

$$\begin{aligned}
E_{n,l,m,m_s} &= E_n^{(0)} + \mu_BB\,(m + 2m_s) + mm_s\,\frac{Ze^2\hbar^2}{8\pi\varepsilon_0\varepsilon_r\mu^2c^2}\left\langle\frac{1}{r^3}\right\rangle \\
&= E_n^{(0)} + \mu_BB\,(m + 2m_s) + mm_s\,\frac{Z^4e^2\hbar^2}{8\pi\varepsilon_0\varepsilon_ra_B^3\mu^2c^2}\,\frac{2}{l(l+1)(2l+1)\,n^3} \\
&= E_n^{(0)}\left(1 - \frac{Z^2\alpha^2}{\varepsilon_r^2}\,\frac{2mm_s}{l(l+1)(2l+1)\,n}\right) + \mu_BB\,(m + 2m_s),
\end{aligned}$$

where I again introduced the fine-structure constant $\alpha$. Including the relativistic correction from Eq. 14.32, I get

$$E_{n,l,m,m_s} = E_n^{(0)}\left[1 + \frac{Z^2\alpha^2}{\varepsilon_r^2n^2}\left(\frac{2n}{2l+1} - \frac{3}{4} - \frac{2n\,mm_s}{l(l+1)(2l+1)}\right)\right] + \mu_BB\,(m + 2m_s). \tag{14.41}$$


The main difference with the limit of the weak field considered in Sect. 14.2.1 is that now the energy eigenvalues are determined by orbital and spin quantum numbers rather than by the total angular momentum numbers $j$ and $m_j$.

In a hand-waving way, one can say that the strong magnetic field “breaks” the 
coupling between the orbital and spin angular momenta forcing them to precess 
independently around the direction of the field; while in the case of the weak field, 
the spin and orbital momenta remain coupled and precess together with the total 
angular momentum J. 
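The structure of the strong-field Zeeman term $\mu_BB(m + 2m_s)$ can be made concrete by simply enumerating the states $|n,l,m\rangle|m_s\rangle$ of a given shell and counting how many of them share each value of $m + 2m_s$. This is a minimal sketch; the function name `zeeman_factors` is an assumption of this example.

```python
from collections import Counter
from fractions import Fraction

HALF = Fraction(1, 2)


def zeeman_factors(n):
    """Map each value of m + 2*m_s to the number of |n,l,m>|m_s> states
    of shell n that have it."""
    counts = Counter()
    for l in range(n):                    # l = 0 .. n-1
        for m in range(-l, l + 1):        # m = -l .. l
            for ms in (HALF, -HALF):      # spin up / spin down
                counts[int(m + 2 * ms)] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    print(zeeman_factors(2))
```

For $n = 2$ the eight states fall into five Zeeman levels with $m + 2m_s = -2, \ldots, 2$ and multiplicities 1, 2, 2, 2, 1; it is the remaining degeneracy within each level that the spin-orbit and relativistic terms resolve.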


14.2.3 Intermediate Magnetic Field 

Between the two extremes of very weak and very strong magnetic field, there lies a terrain of moderate fields, which is the most difficult to explore. In this regime the spin-orbit, relativistic, and Zeeman contributions to the Hamiltonian are of the same order and must all be treated as a perturbation. The zero-order Hamiltonian in this case is just the atomic Hamiltonian with zero-order eigenvalues $E_n^{(0)}$ given by the standard Bohr formula, Eq. 8.16. The degenerate subspace in this case is defined solely by the principal quantum number $n$. The dimension of this subspace is $2n^2$ ($n^2$ without the spin component), and the basis in this subspace can be formed either by vectors $|n,l,m\rangle|m_s\rangle$ with all values of $l$, $m$, and $m_s$ allowed for a given $n$ or by vectors $|n,j,l,m_j\rangle$ with, again, all allowed values of $j$, $l$, and $m_j$. Neither of these bases diagonalizes all three perturbation operators in the degenerate subspace: the Zeeman term is diagonal only in the basis $|n,l,m\rangle|m_s\rangle$, the spin-orbit Hamiltonian only in the basis $|n,j,l,m_j\rangle$, and only the relativistic correction is diagonal in both bases.
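The claim that both bases span the same $2n^2$-dimensional subspace can be checked by direct enumeration of the allowed quantum numbers. The sketch below only counts labels; the function names are illustrative and not taken from the text.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def uncoupled_basis(n):
    """All (l, m, m_s) labels of |n,l,m>|m_s> for a given n."""
    return [(l, m, ms)
            for l in range(n)
            for m in range(-l, l + 1)
            for ms in (HALF, -HALF)]


def coupled_basis(n):
    """All (j, l, m_j) labels of |n,j,l,m_j> for a given n."""
    states = []
    for l in range(n):
        # For l = 0 the set collapses to the single value j = 1/2.
        for j in sorted({abs(l - HALF), l + HALF}):
            mj = -j
            while mj <= j:
                states.append((j, l, mj))
                mj += 1
    return states


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, len(uncoupled_basis(n)), len(coupled_basis(n)), 2 * n * n)
```

For $n = 2$ both lists contain eight entries, which is the size of the perturbation matrix that has to be diagonalized in the intermediate-field regime.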

A thoughtful reader at this point might say: “Wait a minute! In the weak field regime we used vectors $|n,j,l,m_j\rangle$ which were not the eigenvectors of the Zeeman term, but the Zeeman Hamiltonian turned out to be diagonal in this basis nonetheless. What has changed upon transition from the weak field to the moderate field regime?” The answer to this question lies in understanding the structure of the degenerate subspace, which is different in the weak and moderate cases. If you remember, in the weak field case, the zero-order energy depended on the total angular momentum number $j$, which restricted the degenerate subspace to vectors with the same $j$ and different $l$ and $m_j$. And if you keep $j$ the same, the Zeeman energy indeed becomes diagonal with respect to the remaining indexes $l$, $m_j$. The situation changes when the zero-order energy remains degenerate with respect to $j$, throwing eigenvectors with different $j$ into the game. It is with respect to these vectors that the Zeeman Hamiltonian becomes nondiagonal, complicating my and your life substantially.

Such a significant increase in the dimensionality of the degenerate subspace
makes finding the first-order corrections to the energy analytically for an arbitrary $n$
impossible, so I will have to limit my consideration to the simplest nontrivial case of
the energy eigenvalue characterized by the principal quantum number $n = 2$ (the case
$n = 1$ is trivial since it only allows $l = 0$ and has only the trivial degeneracy with
respect to the spin index $m_s$). It is, of course, not as satisfactory as being able to
derive general expressions for arbitrary values of all quantum numbers, but as they
say, a bird in the hand is worth two in the bush, so let’s get what we can.

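There are eight degenerate states belonging to the energy eigenvalue $E_2^{(0)} = -E_g/4$,
where $E_g$ is the absolute value of the energy of the ground state. I choose
to use eigenvectors of the total angular momentum, $|n,j,l,m_j\rangle$, as a basis, so these states are

$$
\begin{aligned}
|1\rangle &= \left|2,\tfrac{1}{2},0,-\tfrac{1}{2}\right\rangle, &
|2\rangle &= \left|2,\tfrac{1}{2},0,\tfrac{1}{2}\right\rangle, &
|3\rangle &= \left|2,\tfrac{3}{2},1,-\tfrac{3}{2}\right\rangle, &
|4\rangle &= \left|2,\tfrac{3}{2},1,\tfrac{3}{2}\right\rangle, \\
|5\rangle &= \left|2,\tfrac{1}{2},1,-\tfrac{1}{2}\right\rangle, &
|6\rangle &= \left|2,\tfrac{3}{2},1,-\tfrac{1}{2}\right\rangle, &
|7\rangle &= \left|2,\tfrac{3}{2},1,\tfrac{1}{2}\right\rangle, &
|8\rangle &= \left|2,\tfrac{1}{2},1,\tfrac{1}{2}\right\rangle.
\end{aligned}
\tag{14.42}
$$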


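To enumerate these states, I introduced for them a simplified notation $|i\rangle$, where
$i = 1, 2, \ldots, 8$, but it should be understood that this numeration is quite arbitrary and
is needed only to index the elements of the $8 \times 8$ perturbation matrix. The total
perturbation operator in this case consists of three terms: the relativistic correction,
Eq. 14.28, the spin-orbit correction, Eq. 14.5, and the Zeeman term, Eq. 14.35. The
first two of these and the $\hat{J}_z$ portion of the Zeeman energy are diagonal in the basis
of vectors given in Eq. 14.42. The respective diagonal elements (expectation values,
really) have been calculated using the same basis for the relativistic correction in
Eq. 14.32 and for the spin-orbit Hamiltonian in Eqs. 14.26 and 14.27, and their sum,
which is perfectly suitable for our goals here, was found in Eq. 14.33. Combining
Eq. 14.33 with the results for $\hat{J}_z$ presented in Eq. 14.36, I can write for the complete
diagonal portion of the perturbation matrix, $H^{(\mathrm{diag})}$:

$$
H^{(\mathrm{diag})}_{ik} = \delta_{ik}\left\{\frac{\left[E_2^{(0)}\right]^2}{2\mu c^2}\left[3 - \frac{8}{j + 1/2}\right] + \mu_B B m_j\right\},
\tag{14.43}
$$

where the matrix indexes $i$ and $k$ correspond to the numeration scheme introduced in Eq. 14.42,
while $j$ and $m_j$ are the quantum numbers of the state $|i\rangle$. For
instance, the element $i = k = 1$ corresponds to quantum numbers $n = 2$, $j = 1/2$,
$l = 0$, and $m_j = -1/2$, so that the respective matrix element is

$$
H^{(\mathrm{diag})}_{11} = -5\,\frac{\left[E_2^{(0)}\right]^2}{2\mu c^2} - \frac{1}{2}\mu_B B.
\tag{14.44}
$$

Similarly you can find all other diagonal matrix elements of $H^{(\mathrm{diag})}$ (I will leave the
derivation to you as an exercise):

$$
\begin{aligned}
H^{(\mathrm{diag})}_{22} = H^{(\mathrm{diag})}_{88} &= -5\,\frac{\left[E_2^{(0)}\right]^2}{2\mu c^2} + \frac{1}{2}\mu_B B, &
H^{(\mathrm{diag})}_{55} &= -5\,\frac{\left[E_2^{(0)}\right]^2}{2\mu c^2} - \frac{1}{2}\mu_B B, \\
H^{(\mathrm{diag})}_{33} &= -\frac{\left[E_2^{(0)}\right]^2}{2\mu c^2} - \frac{3}{2}\mu_B B, &
H^{(\mathrm{diag})}_{44} &= -\frac{\left[E_2^{(0)}\right]^2}{2\mu c^2} + \frac{3}{2}\mu_B B, \\
H^{(\mathrm{diag})}_{66} &= -\frac{\left[E_2^{(0)}\right]^2}{2\mu c^2} - \frac{1}{2}\mu_B B, &
H^{(\mathrm{diag})}_{77} &= -\frac{\left[E_2^{(0)}\right]^2}{2\mu c^2} + \frac{1}{2}\mu_B B.
\end{aligned}
\tag{14.45}
$$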


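Now, let me turn to the more interesting task of finding the matrix of the operator $\hat{S}_z$
in the same basis. To this end I will again invoke the representation of the $|n,j,l,m_j\rangle$
vectors in terms of the $|n,l,m\rangle|m_s\rangle$ vectors using the Clebsch-Gordan expansion, Eqs. 9.93
and 9.94. Specializing these equations first to the vectors $|1\rangle$, $|2\rangle$,
$|3\rangle$, and $|4\rangle$ defined in Eq. 14.42, I find that one of the two terms in Eq. 9.93 or 9.94
vanishes for all these vectors, leaving me with

$$
|1\rangle = |2,0,0\rangle|\downarrow\rangle, \qquad |2\rangle = |2,0,0\rangle|\uparrow\rangle, \tag{14.46}
$$

$$
|3\rangle = |1,-1\rangle|\downarrow\rangle, \qquad |4\rangle = |1,1\rangle|\uparrow\rangle, \tag{14.47}
$$

where, here and below, the common index $n = 2$ is omitted from the orbital vectors $|l,m\rangle$ with $l = 1$.
The situation becomes more interesting for vectors $|5\rangle$ through $|8\rangle$. Vectors $|5\rangle$ and $|8\rangle$ are
characterized by $j = 1/2$, $l = 1$, and $m_j = \mp 1/2$ and are obtained, therefore, from
Eq. 9.94:

$$
|5\rangle = \frac{1}{\sqrt{3}}\left(\sqrt{2}\,|1,-1\rangle|\uparrow\rangle - |1,0\rangle|\downarrow\rangle\right), \tag{14.48}
$$

$$
|8\rangle = \frac{1}{\sqrt{3}}\left(|1,0\rangle|\uparrow\rangle - \sqrt{2}\,|1,1\rangle|\downarrow\rangle\right), \tag{14.49}
$$

while vectors $|6\rangle$ and $|7\rangle$ correspond to $j = 3/2$, $l = 1$, and $m_j = \mp 1/2$, and for
them Eq. 9.93 yields

$$
|6\rangle = \frac{1}{\sqrt{3}}\left(|1,-1\rangle|\uparrow\rangle + \sqrt{2}\,|1,0\rangle|\downarrow\rangle\right), \tag{14.50}
$$

$$
|7\rangle = \frac{1}{\sqrt{3}}\left(\sqrt{2}\,|1,0\rangle|\uparrow\rangle + |1,1\rangle|\downarrow\rangle\right). \tag{14.51}
$$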
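The first thing to notice is that vectors $|1\rangle$ through $|4\rangle$ are eigenvectors of the operator $\hat{S}_z$,
and, therefore, all nondiagonal matrix elements involving these vectors vanish. In
other words, the first four columns of the matrix $(S_z)_{ij}$ consist of all zeroes
with the exception of the diagonal elements $(S_z)_{ii}$. Since $\hat{S}_z$ is a Hermitian operator, the same
is true for the first four rows as well. To give you an idea about how to compute the
diagonal elements in these rows/columns, I compute $(S_z)_{11}$:

$$
(S_z)_{11} = \langle\downarrow|\langle 2,0,0|\,\hat{S}_z\,|2,0,0\rangle|\downarrow\rangle = \langle\downarrow|\hat{S}_z|\downarrow\rangle = -\frac{\hbar}{2}
$$

and leave it up to you to show that the other diagonal elements are $(S_z)_{22} = (S_z)_{44} = \hbar/2$,
while $(S_z)_{33} = -\hbar/2$. The only nondiagonal elements of $(S_z)_{ij}$ can be found
in columns 5 through 8 formed by matrix elements $\langle i|\hat{S}_z|5\rangle$, $\langle i|\hat{S}_z|6\rangle$, $\langle i|\hat{S}_z|7\rangle$, and
$\langle i|\hat{S}_z|8\rangle$ and the respective rows. I will begin with $\hat{S}_z|5\rangle$, which is easily computed
to be

$$
\hat{S}_z|5\rangle = \frac{\hbar}{2\sqrt{3}}\left(\sqrt{2}\,|1,-1\rangle|\uparrow\rangle + |1,0\rangle|\downarrow\rangle\right).
$$

This vector is clearly orthogonal to vectors $|1\rangle$ through $|4\rangle$ as well as to $|7\rangle$ and $|8\rangle$
because they all contain basis vectors characterized by quantum numbers, among
which at least one ($l$, $m$, or $m_s$) is different from the respective numbers appearing in
vector $|5\rangle$. The only non-zero matrix elements in the fifth column are $\langle 6|\hat{S}_z|5\rangle$ and
$\langle 5|\hat{S}_z|5\rangle$:

$$
\langle 6|\hat{S}_z|5\rangle = \frac{\hbar}{6}\left(\langle\uparrow|\langle 1,-1| + \sqrt{2}\,\langle\downarrow|\langle 1,0|\right)\left(\sqrt{2}\,|1,-1\rangle|\uparrow\rangle + |1,0\rangle|\downarrow\rangle\right) = \frac{\sqrt{2}}{3}\hbar,
$$

$$
\langle 5|\hat{S}_z|5\rangle = \frac{\hbar}{6}\left(\sqrt{2}\,\langle\uparrow|\langle 1,-1| - \langle\downarrow|\langle 1,0|\right)\left(\sqrt{2}\,|1,-1\rangle|\uparrow\rangle + |1,0\rangle|\downarrow\rangle\right) = \frac{\hbar}{6}.
$$

Obviously $\langle 5|\hat{S}_z|6\rangle = \langle 6|\hat{S}_z|5\rangle$, and by the same token, the only other non-zero
elements are

$$
\langle 6|\hat{S}_z|6\rangle = \langle 8|\hat{S}_z|8\rangle = -\langle 7|\hat{S}_z|7\rangle = -\frac{\hbar}{6},
$$

$$
\langle 7|\hat{S}_z|8\rangle = \langle 8|\hat{S}_z|7\rangle = \frac{\sqrt{2}}{3}\hbar.
$$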

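Now let me put all these matrix elements together into the respective matrix
representing the contribution of the spin term $(\mu_B B/\hbar)\hat{S}_z$ to the perturbation:

$$
\frac{\mu_B B}{\hbar}(S_z)_{ij} = \mu_B B
\begin{pmatrix}
-\frac{1}{2} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -\frac{1}{2} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \frac{1}{2} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \frac{1}{6} & \frac{\sqrt{2}}{3} & 0 & 0 \\
0 & 0 & 0 & 0 & \frac{\sqrt{2}}{3} & -\frac{1}{6} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{6} & \frac{\sqrt{2}}{3} \\
0 & 0 & 0 & 0 & 0 & 0 & \frac{\sqrt{2}}{3} & -\frac{1}{6}
\end{pmatrix}
\tag{14.52}
$$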
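If you would like to double-check the entries of Eq. 14.52 without doing the bra-ket algebra by hand, the short Python sketch below may help. It is only an illustration under my own conventions: each vector $|i\rangle$ is stored as a dictionary of Clebsch-Gordan coefficients over the product states $|l,m\rangle|m_s\rangle$ (as in Eqs. 14.46 through 14.51), spin up and down are encoded as $m_s = \pm 0.5$, and the names `states`, `sz_element`, and `overlap` are made up for this example. The result is the matrix $(S_z)_{ij}$ in units of $\hbar$.

```python
from math import sqrt

r2, r3 = sqrt(2.0), sqrt(3.0)

# Each coupled state |i> is a dict {(l, m, m_s): coefficient} over product states
# |2, l, m>|m_s>; m_s = +0.5 means spin up, -0.5 spin down (Eqs. 14.46-14.51).
states = [
    {(0, 0, -0.5): 1.0},                               # |1>
    {(0, 0, +0.5): 1.0},                               # |2>
    {(1, -1, -0.5): 1.0},                              # |3>
    {(1, 1, +0.5): 1.0},                               # |4>
    {(1, -1, +0.5): r2 / r3, (1, 0, -0.5): -1 / r3},   # |5>
    {(1, -1, +0.5): 1 / r3, (1, 0, -0.5): r2 / r3},    # |6>
    {(1, 0, +0.5): r2 / r3, (1, 1, -0.5): 1 / r3},     # |7>
    {(1, 0, +0.5): 1 / r3, (1, 1, -0.5): -r2 / r3},    # |8>
]

def overlap(bra, ket):
    """<bra|ket> for real expansion coefficients."""
    return sum(coeff * ket[key] for key, coeff in bra.items() if key in ket)

def sz_element(bra, ket):
    """<bra|S_z|ket> in units of hbar: S_z multiplies each product state by m_s."""
    return sum(coeff * ket[key] * key[2] for key, coeff in bra.items() if key in ket)

# Full 8x8 matrix (S_z)_ij / hbar in the basis |1>, ..., |8>.
sz = [[sz_element(bra, ket) for ket in states] for bra in states]

for row in sz:
    print("  ".join(f"{value:7.4f}" for value in row))
```

Multiplying the printed matrix by $\mu_B B$ gives the right-hand side of Eq. 14.52, and the two $2 \times 2$ blocks in the lower right part are visible at a glance.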


While the idea of dealing with an $8 \times 8$ matrix might be terrifying, the matrix
appearing in Eq. 14.52 has a very special structure, in which small groups of
elements along its main diagonal are surrounded by zeroes from all sides and are,
thereby, separated from other elements. Such groups in the first four rows and
columns consist of just single elements on the main diagonal; then you see the
group of four elements in the fifth and sixth rows and the columns with the
same numbers, which appears as a $2 \times 2$ matrix surrounded by zeroes on all sides, and,
finally, a similar structure appears in the lower right corner of the matrix. Matrices
having such a structure are called block-diagonal, and what makes them special is that
each block can be considered independently of the others and treated accordingly.
To see what it means, imagine that this matrix is multiplied by a column with
eight elements labeled as $a_i$ with $i$ changing from one to eight. In the resulting
column, the first four elements will appear just by themselves, not mixed with any
other elements, elements $a_5$ and $a_6$ will only mix with each other, and the same
is true for elements $a_7$ and $a_8$. Now if this resulting column is equated to another
column to form a system of equations, the entire system will disintegrate into four
single independent equations and two systems of equations, each containing only
two coupled coefficients. In other words, instead of having to solve a system of eight
equations in eight variables, one is left with four independent equations of a single
variable and two pairs of equations involving only two variables each.⁷ As a
result, while an eight-variable problem might initially appear quite insurmountable for
an analytical solution, the block-diagonal structure of the matrix makes it easily
solvable even by a high school student. The matrix $H^{(\mathrm{pert})}_{ij}$ representing the entire
perturbation with relativistic, spin-orbit, and Zeeman terms included is obtained by
adding the diagonal matrix described in Eqs. 14.43 through 14.45 to the matrix of Eq. 14.52:

⁷ Gosh, I am repeating myself, aren’t I? I said exactly the same words in Chap. 9, but well, it is all
for your benefit.


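$$
H^{(\mathrm{pert})}_{ij} =
\begin{pmatrix}
-5\epsilon_{so} - \epsilon_Z & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & -5\epsilon_{so} + \epsilon_Z & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -\epsilon_{so} - 2\epsilon_Z & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -\epsilon_{so} + 2\epsilon_Z & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & -5\epsilon_{so} - \frac{1}{3}\epsilon_Z & \frac{\sqrt{2}}{3}\epsilon_Z & 0 & 0 \\
0 & 0 & 0 & 0 & \frac{\sqrt{2}}{3}\epsilon_Z & -\epsilon_{so} - \frac{2}{3}\epsilon_Z & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & -\epsilon_{so} + \frac{2}{3}\epsilon_Z & \frac{\sqrt{2}}{3}\epsilon_Z \\
0 & 0 & 0 & 0 & 0 & 0 & \frac{\sqrt{2}}{3}\epsilon_Z & -5\epsilon_{so} + \frac{1}{3}\epsilon_Z
\end{pmatrix}
\tag{14.53}
$$

where I introduced notations

$$
\epsilon_{so} = \frac{\left[E_2^{(0)}\right]^2}{2\mu c^2}, \qquad \epsilon_Z = \mu_B B
$$

to fit this matrix on the page.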

The block-diagonal form of the matrix allows me to immediately claim that
vectors

$$
\left|2,\tfrac{1}{2},0,-\tfrac{1}{2}\right\rangle = |2,0,0\rangle|\downarrow\rangle, \qquad
\left|2,\tfrac{1}{2},0,\tfrac{1}{2}\right\rangle = |2,0,0\rangle|\uparrow\rangle,
$$

$$
\left|2,\tfrac{3}{2},1,-\tfrac{3}{2}\right\rangle = |1,-1\rangle|\downarrow\rangle, \qquad
\left|2,\tfrac{3}{2},1,\tfrac{3}{2}\right\rangle = |1,1\rangle|\uparrow\rangle
$$

are eigenvectors of the entire perturbation matrix, while the corresponding eigenvalues
yield four energy levels of hydrogen with corrections due to relativistic and
magnetic field effects valid to the first order in the perturbation. These levels split
off the initially degenerate $n = 2$ level of hydrogen and are
distinct for different values of the quantum numbers $j$, $l$, and $m_j$. Using the same order
of these numbers as in the designation of the states, I can write

$$E_{2,1/2,0,-1/2} = E_2^{(0)} - 5\epsilon_{so} - \epsilon_Z, \tag{14.54}$$

$$E_{2,1/2,0,1/2} = E_2^{(0)} - 5\epsilon_{so} + \epsilon_Z, \tag{14.55}$$

$$E_{2,3/2,1,-3/2} = E_2^{(0)} - \epsilon_{so} - 2\epsilon_Z, \tag{14.56}$$

$$E_{2,3/2,1,3/2} = E_2^{(0)} - \epsilon_{so} + 2\epsilon_Z. \tag{14.57}$$
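The next four energy levels splitting off our $n = 2$ unperturbed value cannot be
assigned a number $j$ because they originate from superpositions of states with
different values of this number. They, however, can still be assigned numbers $l = 1$
and $m_j = -1/2$ for levels arising from the states in the fifth and sixth rows/columns
and $l = 1$ and $m_j = 1/2$ for energies originating from rows/columns 7 and 8. To
find these energies, I have to solve two pairs of eigenvalue equations:

$$
\begin{aligned}
-\left(5\epsilon_{so} + \frac{1}{3}\epsilon_Z\right)a_5 + \frac{\sqrt{2}}{3}\epsilon_Z a_6 &= \left(E_{2,\mathrm{sup},1,-1/2} - E_2^{(0)}\right)a_5, \\
\frac{\sqrt{2}}{3}\epsilon_Z a_5 - \left(\epsilon_{so} + \frac{2}{3}\epsilon_Z\right)a_6 &= \left(E_{2,\mathrm{sup},1,-1/2} - E_2^{(0)}\right)a_6
\end{aligned}
\tag{14.58}
$$

and

$$
\begin{aligned}
\left(-\epsilon_{so} + \frac{2}{3}\epsilon_Z\right)a_7 + \frac{\sqrt{2}}{3}\epsilon_Z a_8 &= \left(E_{2,\mathrm{sup},1,1/2} - E_2^{(0)}\right)a_7, \\
\frac{\sqrt{2}}{3}\epsilon_Z a_7 + \left(-5\epsilon_{so} + \frac{1}{3}\epsilon_Z\right)a_8 &= \left(E_{2,\mathrm{sup},1,1/2} - E_2^{(0)}\right)a_8,
\end{aligned}
\tag{14.59}
$$

where the unperturbed energy $E_2^{(0)}$ is subtracted on the right-hand sides because the matrix of
Eq. 14.53 contains only the perturbation. Coefficients $a_5$ and $a_6$ determine the structure of the eigenvectors $|2,\mathrm{sup},1,-1/2\rangle$:

$$
|2,\mathrm{sup},1,-1/2\rangle = a_5|5\rangle + a_6|6\rangle = a_5\left|2,\tfrac{1}{2},1,-\tfrac{1}{2}\right\rangle + a_6\left|2,\tfrac{3}{2},1,-\tfrac{1}{2}\right\rangle
$$

corresponding to eigenvalues $E_{2,\mathrm{sup},1,-1/2}$, while coefficients $a_7$, $a_8$ define the eigenvectors
$|2,\mathrm{sup},1,1/2\rangle$:

$$
|2,\mathrm{sup},1,1/2\rangle = a_7|7\rangle + a_8|8\rangle = a_7\left|2,\tfrac{3}{2},1,\tfrac{1}{2}\right\rangle + a_8\left|2,\tfrac{1}{2},1,\tfrac{1}{2}\right\rangle,
$$

where in place of the index $j$ in the eigenvectors and the eigenvalues, I placed the
abbreviation sup as a reminder that the corresponding vectors represent superposition
states with uncertain values of $j$. The eigenvalues are found as zeroes of the
determinant

$$
\begin{vmatrix}
5\epsilon_{so} + \frac{1}{3}\epsilon_Z + E_{2,\mathrm{sup},1,-1/2} - E_2^{(0)} & -\frac{\sqrt{2}}{3}\epsilon_Z \\
-\frac{\sqrt{2}}{3}\epsilon_Z & \epsilon_{so} + \frac{2}{3}\epsilon_Z + E_{2,\mathrm{sup},1,-1/2} - E_2^{(0)}
\end{vmatrix}
$$

for Eq. 14.58 and the determinant

$$
\begin{vmatrix}
\epsilon_{so} - \frac{2}{3}\epsilon_Z + E_{2,\mathrm{sup},1,1/2} - E_2^{(0)} & -\frac{\sqrt{2}}{3}\epsilon_Z \\
-\frac{\sqrt{2}}{3}\epsilon_Z & 5\epsilon_{so} - \frac{1}{3}\epsilon_Z + E_{2,\mathrm{sup},1,1/2} - E_2^{(0)}
\end{vmatrix}
$$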
for Eq. 14.59. Solving the corresponding quadratic equations, I find

$$
E^{(\pm)}_{2,\mathrm{sup},1,-1/2} = E_2^{(0)} - 3\epsilon_{so} - \frac{\epsilon_Z}{2} \pm \sqrt{4\epsilon_{so}^2 - \frac{2}{3}\epsilon_{so}\epsilon_Z + \frac{\epsilon_Z^2}{4}}, \tag{14.60}
$$

$$
E^{(\pm)}_{2,\mathrm{sup},1,1/2} = E_2^{(0)} - 3\epsilon_{so} + \frac{\epsilon_Z}{2} \pm \sqrt{4\epsilon_{so}^2 + \frac{2}{3}\epsilon_{so}\epsilon_Z + \frac{\epsilon_Z^2}{4}}. \tag{14.61}
$$

In the limit of zero magnetic field, the corrections to $E_2^{(0)}$ in both these expressions, together with those in all other
energies in Eqs. 14.54 through 14.57, are reduced to only two values, $-5\epsilon_{so}$ and $-\epsilon_{so}$,
as predicted in Eq. 14.33 for $j = 1/2$ and $j = 3/2$, restoring the degeneracy with
respect to $l$ and $m_j$ characteristic of the fine structure of hydrogen. The structure
of Eqs. 14.60 and 14.61 provides the quantitative criterion for the weak field and
strong field limits discussed in Sects. 14.2.1 and 14.2.2: the former is defined by the
condition $\epsilon_Z \ll \epsilon_{so}$ and the latter by the opposite inequality. Expanding Eqs. 14.60
and 14.61 in a power series with respect to $\epsilon_Z/\epsilon_{so}$ or $\epsilon_{so}/\epsilon_Z$, correspondingly, and
keeping only linear terms, one can reproduce the results of the corresponding sections. I
will let you confirm this fact as an exercise. Coefficients $a_5$ through $a_8$ are found by
substituting the energy eigenvalues into the respective equations, but the resulting
expressions are rather cumbersome and not too informative, so I will refrain from
showing them here. In the weak and strong field limits, you will be asked to derive
them as an exercise.
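For readers who prefer to see numbers, here is a small Python sketch that checks the closed-form roots against the two $2 \times 2$ blocks of the perturbation matrix. It is an illustration rather than part of the derivation: energies are measured from $E_2^{(0)}$, the inputs `eps_so` and `eps_z` stand for $\epsilon_{so}$ and $\epsilon_Z$ in arbitrary common units, and the helper names (`blocks`, `closed_form_pairs`, `secular`) are invented for this example. The check relies on the fact that the secular determinant of a block must vanish at each of its eigenvalues.

```python
from math import sqrt

def blocks(eps_so, eps_z):
    """Diagonal entries for |1>..|4> and the two 2x2 blocks of the perturbation matrix."""
    singles = [-5 * eps_so - eps_z, -5 * eps_so + eps_z,
               -eps_so - 2 * eps_z, -eps_so + 2 * eps_z]
    off = sqrt(2.0) / 3.0 * eps_z
    block_56 = [[-5 * eps_so - eps_z / 3, off], [off, -eps_so - 2 * eps_z / 3]]
    block_78 = [[-eps_so + 2 * eps_z / 3, off], [off, -5 * eps_so + eps_z / 3]]
    return singles, block_56, block_78

def closed_form_pairs(eps_so, eps_z):
    """Shifts E - E_2^(0) for m_j = -1/2 and m_j = +1/2 (lower root first)."""
    root_minus = sqrt(4 * eps_so**2 - 2 * eps_so * eps_z / 3 + eps_z**2 / 4)
    root_plus = sqrt(4 * eps_so**2 + 2 * eps_so * eps_z / 3 + eps_z**2 / 4)
    pair_minus = (-3 * eps_so - eps_z / 2 - root_minus,
                  -3 * eps_so - eps_z / 2 + root_minus)
    pair_plus = (-3 * eps_so + eps_z / 2 - root_plus,
                 -3 * eps_so + eps_z / 2 + root_plus)
    return pair_minus, pair_plus

def secular(block, shift):
    """det(block - shift * identity) for a 2x2 block; zero when shift is an eigenvalue."""
    (a, b), (c, d) = block
    return (a - shift) * (d - shift) - b * c

eps_so = 1.0
for eps_z in (0.01, 1.0, 100.0):          # weak, intermediate, and strong field
    _, block_56, block_78 = blocks(eps_so, eps_z)
    pair_minus, pair_plus = closed_form_pairs(eps_so, eps_z)
    residuals = ([secular(block_56, e) for e in pair_minus]
                 + [secular(block_78, e) for e in pair_plus])
    print(eps_z, pair_minus, pair_plus, max(abs(r) for r in residuals))
```

For small `eps_z` the printed shifts approach $-5\epsilon_{so} \mp \epsilon_Z/3$ and $-\epsilon_{so} \mp 2\epsilon_Z/3$, which is what the linear expansion mentioned above should give, while for large `eps_z` the splitting becomes dominated by terms proportional to $\epsilon_Z$.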


14.3 Problems 

Problem 166 Derive Eq. 14.20. 

Problem 167 Consider a hydrogen atom in vacuum and evaluate all energy levels 
constituting its fine structure corresponding to the principal quantum number n = 3. 
Which of these energy eigenvalues remain degenerate, and what is the degree of 
degeneracy? Determine the frequencies of light required to observe transitions from 
the ground state of hydrogen to each of these states. 


For Sect. 14.2 

Problem 168 Verify Eq. 14.33. 

Problem 169 Consider the optical spectra associated with transitions between the
ground state of the hydrogen atom and the energy levels characterized by $n = 3$.
Assume that the observations are conducted in the magnetic field $B = 10^{-2}$ T, and
evaluate the wavelengths of light corresponding to each transition assuming that you
are in the regime of the weak magnetic field. Repeat these calculations for the magnetic
field $B = 10^{2}$ T using the results derived for the strong field limit. Can you relate
each of the lines found in the weak field limit to the corresponding lines in the strong
field limit?

Problem 170 Verify Eqs. 14.46 and 14.47. 

Problem 171 Verify that vectors $|5\rangle$, $|6\rangle$, $|7\rangle$, and $|8\rangle$ defined in Eqs. 14.48-14.51
are orthogonal to the vector obtained by applying the operator $\hat{S}_z$ to vector $|3\rangle$.

Problem 172 Demonstrate that Eqs. 14.60 and 14.61 for the energy eigenvalues
reproduce the results obtained in the limits of weak and strong magnetic fields in the
approximation linear in the parameters $\epsilon_Z/\epsilon_{so}$ or $\epsilon_{so}/\epsilon_Z$, correspondingly.

Problem 173 Find coefficients $a_5$ through $a_8$ in Eqs. 14.58 and 14.59 in the weak
and strong field limits using the expressions for the eigenvalues derived in the
previous problem.

Problem 174 Find the energy levels arising due to Zeeman splitting of n = 
3 eigenvalue of the hydrogen in the presence of the spin-orbit and relativistic 
corrections in the intermediate field regime. (Hint: This is a long problem which 
involves having to deal with an 18 x 18 perturbation matrix, which, however, 
hopefully will have a block-diagonal structure allowing this problem to be solved.) 


Chapter 15 

Emission and Absorption of Light 




15.1 Time-Dependent Perturbation Theory 

It is no secret that quantum mechanics grew up to a large extent out of efforts to 
understand emission and absorption spectra of atoms. Indeed, the famous Planck 
distribution was introduced to explain the spectrum of the black-body radiation 
(not to be confused with black hole, of course), and one of Bohr’s postulates dealt 
explicitly with conditions for the emission or absorption of light by atoms. It is 
not surprising, therefore, that one of the first problems studied by Paul Dirac in his
seminal 1926 paper on “new” quantum mechanics¹ (as opposed to the pre-1925
“old” quantum theory based on the Bohr-Sommerfeld quantization principle) was the
problem of interaction between light and atoms. This problem belongs to a broad
class of problems in which the Hamiltonian of a system can be presented as a
sum of the “unperturbed” Hamiltonian $\hat{H}_0$ and a perturbation $\hat{V}$. However, unlike
the perturbations considered in Chap. 13, the operator $\hat{V}$ is now allowed to depend on time,
so that we end up with a problem involving a time-dependent Hamiltonian:

$$\hat{H} = \hat{H}_0 + \hat{V}(t). \tag{15.1}$$

A time dependence of the Hamiltonian is a big deal because it does not just 
completely change the way we must approach a problem, it changes the questions 
that must be asked. To some extent this issue has already been discussed in 
Sect. 10.2, where I dealt with a two-level system interacting with a periodic electric 
field and introduced the ideas of quantum transitions and their probabilities that 
replaced eigenvalues and eigenvectors as main objects of study. If you do not 
remember what I am talking about, please, go back to Sect. 10.2 for a brief refresher. 


¹ P.A.M. Dirac, On the theory of quantum mechanics. Proc. R. Soc. A 112, 661 (1926).
The simple model considered in Sect. 10.2, as well as its more generic extension
presented by Eq. 15.1, makes an important assumption about the perturbation
operator $\hat{V}(t)$. It is supposed that the perturbation potential is due to interaction
with some external objects or fields, which are not a part of the system described
by the Hamiltonian of Eq. 15.1. Whatever parameters characterizing the perturbation appear
in $\hat{V}(t)$, they are assumed to be fixed by some external conditions and do not
change due to interaction with the atom. For instance, if the perturbation is an 
electric field of an electromagnetic wave, this field is taken as being emitted by 
an external source (laser, lamp, sun, etc.), and any possible changes in it caused by 
the interaction with the atom are neglected.² This assumption is not always valid.
For instance, if you wanted to describe the lasing phenomenon, you would have
to complement the Schrödinger or Heisenberg equations for the atom with Maxwell's
equations for the field that would include the dipole moment of the atoms as a
source term. In this approach the atom and the electromagnetic field are treated self- 
consistently, as an interacting system with interdependent dynamics, with the only 
difference that light is considered as a classical (not quantum) object. While it is an 
improvement compared to the initial assumption of the independent electromagnetic 
field, it is still not sufficient to describe such effects as the finite laser linewidth 
or spontaneous emission. Dirac understood the shortcomings of this approach very 
well, and, therefore, just a year after publishing the 1926 paper, he produced another 
publication, where he described the electromagnetic field as a set of quantized 
dynamical variables and solved the light-atom interaction problem treating both 
the atom and the field quantum mechanically.³ Unfortunately, even the semiclassical
theory of atom-light interaction, let alone its full quantum treatment, is beyond
the scope of this book, so you will have to wait till you are ready for a more 
advanced quantum mechanics course to learn about Dirac’s derivation of the rate 
of spontaneous emission as well as about the further development of Dirac’s work 
in Wigner-Weisskopf theory of spontaneous emission. You will also have to refer to 
more serious books on laser theory to learn about the Schawlow-Townes formula for 
the fundamental laser linewidth. Here and for now, you are stuck with the simplest 
version of the theory of interaction between quantum systems and light. 

In general, the spectrum of the Hamiltonian $\hat{H}_0$ in Eq. 15.1 might contain both
discrete and continuous segments with an infinite number of eigenvalues and
eigenvectors. When the number of unperturbed states exceeds
two, even the rotating wave approximation, which I had to use in Sect. 10.2 to solve
the two-level model, is not going to help me much. Therefore, several approximation
schemes have been invented to deal with this problem. One of the most popular of
them is the time-dependent perturbation theory, which allows one to determine the evolution
of a state of a system due to a time-dependent perturbation. Having found the time
dependence of the state, you would be able to determine the probability distributions
and expectation values of any quantity of interest.

² If you are wondering what kind of changes the atom can impose on the field, here are two
examples: (a) absorption by atoms can change its amplitude (or the number of photons if you
prefer quantum language), and (b) atoms can emit light at the same frequency as the incident field
resulting in the increase of its amplitude.

³ P.A.M. Dirac, The quantum theory of the emission and absorption of radiation. Proc. R. Soc. A
114, 243 (1927).

I will begin by assuming that I know all eigenvalues $E_n$ and eigenvectors $|\alpha_n\rangle$ of the
unperturbed Hamiltonian $\hat{H}_0$:

$$\hat{H}_0|\alpha_n\rangle = E_n|\alpha_n\rangle. \tag{15.2}$$

The lower index $n$ here might actually be a composite index consisting of multiple
elements, such as principal, azimuthal, and magnetic quantum numbers for a
hydrogen atom, or a set of numbers $n_x$, $n_y$, $n_z$ characterizing the states of a particle in
a three-dimensional potential well, and may include a spin magnetic number as well.
Correspondingly, all summations over $n$ appearing below imply summations over all
respective indexes, and all vectors $|\alpha_n\rangle$ are assumed to be normalized and orthogonal
to each other, even if they belong to the same degenerate energy eigenvalue. Using
these states as a basis, I can present an arbitrary time-dependent vector $|\psi(t)\rangle$ as

$$|\psi(t)\rangle = \sum_n c_n(t)e^{-iE_n t/\hbar}|\alpha_n\rangle. \tag{15.3}$$


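Equation 15.3 is a simple generalization of Eq. 10.13 from Sect. 10.2, where I
have already explained the meaning of coefficients $c_n$ and exponential factors
$\exp(-iE_n t/\hbar)$. It should be noted that Eq. 15.3 implies that all eigenvectors of $\hat{H}_0$
belong to the discrete spectrum. If this is not the case, the sum over $n$ has to be
complemented by an integral over the continuous eigenvalues. Alternatively, I can
always turn a continuous spectrum into a quasi-continuous one by imposing periodic
boundary conditions as explained in Sect. 11.5. Substituting this expression into the
Schrödinger equation

$$i\hbar\frac{\partial|\psi\rangle}{\partial t} = \hat{H}|\psi\rangle$$

and taking into account Eqs. 15.1 and 15.2 yields the system of equations for the
unknown coefficients $c_n$:

$$
\begin{aligned}
& i\hbar\sum_n\left[-\frac{iE_n}{\hbar}c_n(t)e^{-iE_n t/\hbar}|\alpha_n\rangle + \frac{dc_n}{dt}e^{-iE_n t/\hbar}|\alpha_n\rangle\right] =
\sum_n c_n(t)e^{-iE_n t/\hbar}\hat{H}_0|\alpha_n\rangle + \sum_n c_n(t)e^{-iE_n t/\hbar}\hat{V}|\alpha_n\rangle \;\Rightarrow \\
& \sum_n E_n c_n(t)e^{-iE_n t/\hbar}|\alpha_n\rangle + i\hbar\sum_n\frac{dc_n}{dt}e^{-iE_n t/\hbar}|\alpha_n\rangle =
\sum_n E_n c_n(t)e^{-iE_n t/\hbar}|\alpha_n\rangle + \sum_n c_n(t)e^{-iE_n t/\hbar}\hat{V}|\alpha_n\rangle \;\Rightarrow \\
& i\hbar\sum_n\frac{dc_n}{dt}e^{-iE_n t/\hbar}|\alpha_n\rangle = \sum_n c_n(t)e^{-iE_n t/\hbar}\hat{V}|\alpha_n\rangle.
\end{aligned}
$$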
Premultiplying the last expression by the bra vector $\langle\alpha_m|$ and using the orthogonality
of the basis vectors, I generate the following system of equations for coefficients $c_n$:

$$i\hbar\frac{dc_m}{dt} = \sum_n V_{mn}e^{i\omega_{mn}t}c_n, \tag{15.4}$$

where $\omega_{mn}$, called the transition frequency, is defined as

$$\omega_{mn} = \frac{E_m - E_n}{\hbar}. \tag{15.5}$$

$V_{mn}$ is again the same perturbation matrix element $V_{mn} = \langle\alpha_m|\hat{V}|\alpha_n\rangle$, which was
introduced earlier in Chap. 13. Equation 15.4 is a generalization of the system of
equations, Eqs. 10.16 and 10.17, to the case of multiple states and an arbitrary
perturbation potential.
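To make Eq. 15.4 a bit more tangible, here is a minimal Python sketch that integrates it numerically for a small made-up system. Everything specific in it is an assumption chosen for illustration only: units with $\hbar = 1$, three levels with energies 0, 1, and 2.5, a weak coupling of the form $0.05\cos t$ between neighboring levels, and a standard fourth-order Runge-Kutta stepper. The function names `derivative`, `rk4`, and `drive` are hypothetical.

```python
import cmath
from math import cos

HBAR = 1.0  # assumption: units in which hbar = 1

def derivative(t, c, energies, v_of_t):
    """Right-hand side of Eq. 15.4 solved for dc_m/dt."""
    v = v_of_t(t)
    size = len(c)
    out = []
    for m in range(size):
        total = 0j
        for n in range(size):
            w_mn = (energies[m] - energies[n]) / HBAR   # transition frequency, Eq. 15.5
            total += v[m][n] * cmath.exp(1j * w_mn * t) * c[n]
        out.append(total / (1j * HBAR))
    return out

def rk4(c0, energies, v_of_t, t_final, steps):
    """Classical fourth-order Runge-Kutta integration from t = 0 to t_final."""
    c = [complex(x) for x in c0]
    dt = t_final / steps
    for k in range(steps):
        t = k * dt
        k1 = derivative(t, c, energies, v_of_t)
        k2 = derivative(t + dt / 2, [x + dt / 2 * d for x, d in zip(c, k1)], energies, v_of_t)
        k3 = derivative(t + dt / 2, [x + dt / 2 * d for x, d in zip(c, k2)], energies, v_of_t)
        k4 = derivative(t + dt, [x + dt * d for x, d in zip(c, k3)], energies, v_of_t)
        c = [x + dt / 6 * (a + 2 * b + 2 * p + q)
             for x, a, b, p, q in zip(c, k1, k2, k3, k4)]
    return c

def drive(t):
    """Hypothetical Hermitian perturbation matrix V_mn(t) coupling neighboring levels."""
    g = 0.05 * cos(t)
    return [[0.0, g, 0.0], [g, 0.0, g], [0.0, g, 0.0]]

energies = [0.0, 1.0, 2.5]
c_final = rk4([1.0, 0.0, 0.0], energies, drive, t_final=20.0, steps=4000)
populations = [abs(x) ** 2 for x in c_final]
print(populations, sum(populations))
```

Because the matrix $V_{mn}$ is Hermitian, the sum of the populations $|c_m|^2$ should stay equal to one up to the numerical error of the integrator, which provides a simple sanity check of both the equations and the code.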

In the most generic case, this is a system of infinitely many linear differential
equations of the first order, which is equivalent to the initial Schrödinger equation
and still cannot be solved exactly. The advantage of this representation over the
initial abstract form of the Schrödinger equation is that all the complexity of the
perturbation potential, which is not directly concerned with its time dependence
(e.g., its dependence on coordinate, momentum, or spin operators), is coded into the
corresponding matrix elements, which can be computed using any of the available
representations for the corresponding operators. To make this argument clearer,
imagine the original Schrödinger equation in the position representation, where it
takes the form of a partial differential equation describing the dependence
of the wave function upon at least four variables (time plus three coordinates). The
transition to the equations for the coefficients $c_n$ given by Eq. 15.4 eliminates the
spatial coordinates, which are integrated out and are hidden in the definition of the
matrix elements $V_{mn}$.

The price for this is, of course, that now, instead of one equation, you have to deal 
with a system of infinitely many, albeit much simpler, equations. However, there are 
several tools, which you can use to make the problem manageable. One of them— 
restricting the number of states included in the expansion, Eq. 15.3 (and, hence, the 
number of equations in the system)—has been already demonstrated in Sect. 10.2. 
In principle, if needed, this approach can be extended to include as many states 
as necessary so that the resulting finite system of equations can be solved at least 
numerically with the help of a computer. In this section, however, I will introduce a 
different method of solving Eq. 15.4 based upon the assumption that the perturbation 
operator is, in some sense, small. This method, which was first used in the already 
mentioned famous 1926 paper by P. Dirac, and many times since, is responsible for 
the most important results in quantum theory of light-atom interaction. 

Now, back to Eq. 15.4. To develop the perturbation approach to this equation,
I will again pull out of the perturbation operator a formal “strength” parameter $\epsilon$:
$\hat{V} \to \epsilon\hat{V}$, which I will use for the same bookkeeping purposes as in Chap. 13. Now,
I will present an arbitrary coefficient $c_m$ as a power series of the form:

$$c_m = c_m^{(0)} + \epsilon c_m^{(1)} + \epsilon^2 c_m^{(2)} + \cdots, \tag{15.6}$$

where the first, zero-order, term reproduces the coefficients corresponding to
whatever state the system would be in the absence of the perturbation, the second,
first-order, term introduces corrections linear in the perturbation operator, the
next one adds corrections quadratic in the perturbation, and so on and so forth.
Substituting this expression into Eq. 15.4, I generate an equation containing terms
with various powers of $\epsilon$:

$$
i\hbar\left[\frac{dc_m^{(0)}}{dt} + \epsilon\frac{dc_m^{(1)}}{dt} + \epsilon^2\frac{dc_m^{(2)}}{dt} + \cdots\right] =
\epsilon\sum_n V_{mn}e^{i\omega_{mn}t}\left(c_n^{(0)} + \epsilon c_n^{(1)} + \epsilon^2 c_n^{(2)} + \cdots\right).
$$


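This equation can be satisfied for an arbitrary value of $\epsilon$ if and only if terms
with equal powers of $\epsilon$ on the left-hand and the right-hand sides of this equation
are individually equal to each other. Equating these terms yields the following set of
equations:

$$
\begin{aligned}
i\hbar\frac{dc_m^{(0)}}{dt} &= 0, \\
i\hbar\frac{dc_m^{(1)}}{dt} &= \sum_n V_{mn}e^{i\omega_{mn}t}c_n^{(0)}, \\
&\;\;\vdots \\
i\hbar\frac{dc_m^{(r)}}{dt} &= \sum_n V_{mn}e^{i\omega_{mn}t}c_n^{(r-1)}.
\end{aligned}
\tag{15.7}
$$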


The first equation simply states that the zero-order coefficients $c_m^{(0)}$ do not depend
on time, the second equation defines the first-order coefficients $c_m^{(1)}$ in terms of
$c_m^{(0)}$, the next equation defines $c_m^{(2)}$ in terms of $c_m^{(1)}$, and each next coefficient $c_m^{(r)}$ is
defined in terms of the coefficients of the preceding order, $c_m^{(r-1)}$. Thus, starting with $c_m^{(0)}$,
which are supposed to be known, one can, in principle, solve each of the equations
in this sequence and find the coefficients in any order of the perturbation.
However, in order to actually proceed with this plan, I need additional information,
namely, I’ve got to know the state of the system at some predetermined time instant
$t_0$. Mathematically speaking, what I need are initial conditions for the expansion
coefficients $c_m$, determined by the concrete physical context of the problem. As
has already been explained in Sect. 10.2, in order to provide a meaningful physical
interpretation to calculations with time-dependent Hamiltonians, one has to assume
that the perturbation has a beginning and that it also has to end (doesn’t this
statement apply to everything in life?). Then the state of the system in the absence
of the perturbation, characterized by zero-order coefficients $c_m^{(0)}$, defines natural
initial conditions for $c_m(t)$: $c_m(t_0) = c_m^{(0)}$. This, in turn, means that all higher-order
contributions $c_m^{(r)}$ to $c_m(t)$ must vanish at $t = t_0$. Strictly speaking, both the on
and off times might not always be rigorously defined because switching-on and
switching-off of the perturbation is never instantaneous. Still, in some situations,
assigning an exact value to $t_0$, which is usually chosen to be $t_0 = 0$, is justifiable
and is frequently used in practical calculations. In other instances it might make
more sense to assume that the perturbation grows gradually from zero and assign
$t_0 = -\infty$. The particular choice usually depends on the problem at hand, but in what
follows, unless I explicitly state otherwise, I will assume that the perturbation
is turned on instantaneously and choose $t_0 = 0$.

Now, once the initial conditions for all differential equations in Eq. 15.7 are
specified, they can be solved quite easily by simple integration of both sides with
respect to time. The simplest way to incorporate the initial conditions into the
solution is to write them down as definite integrals with the lower limit set to 0 and
the upper limit to $t$:

$$c_m^{(0)} = \mathrm{const}, \tag{15.8}$$

$$c_m^{(1)}(t) = \frac{1}{i\hbar}\sum_n c_n^{(0)}\int_0^t d\tau\, V_{mn}(\tau)e^{i\omega_{mn}\tau}, \tag{15.9}$$

$$c_m^{(r)}(t) = \frac{1}{i\hbar}\sum_n\int_0^t d\tau\, V_{mn}(\tau)e^{i\omega_{mn}\tau}c_n^{(r-1)}(\tau). \tag{15.10}$$

It is quite obvious that this form of the solution ensures that all $c_m^{(r)}$ with $r > 0$ vanish
at $t = 0$ as required by the initial conditions. Equation 15.9 presents a ready-to-use
solution for the first-order correction to the time-dependent state vector $|\psi(t)\rangle$. As
an illustration I will also derive the final expression for the second-order correction,
but I am not going to make much use of it in what follows. Equation 15.10 adapted
to $r = 2$ yields

$$c_m^{(2)}(t) = \frac{1}{i\hbar}\sum_n\int_0^t d\tau\, V_{mn}(\tau)e^{i\omega_{mn}\tau}c_n^{(1)}(\tau).$$

Substitution of Eq. 15.9 yields

$$
c_m^{(2)}(t) = \left(\frac{1}{i\hbar}\right)^2\sum_n\sum_p c_p^{(0)}\int_0^t d\tau\int_0^{\tau} d\tau'\, V_{mn}(\tau)V_{np}(\tau')e^{i\omega_{mn}\tau}e^{i\omega_{np}\tau'},
\tag{15.11}
$$

where I made the necessary changes in the summation indexes and integration variables
to be able to write down the expression as a double sum and a double integral. It
is not too difficult to write down an expression for an arbitrary coefficient $c_m^{(r)}(t)$,
which will obviously include $r$ summations and integrations; I will let you enjoy
figuring it out on your own as an exercise.
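The order-by-order construction of Eq. 15.10 lends itself to a simple numerical experiment. The Python sketch below is only an illustration built on assumptions of my own choosing: a two-level system with $\hbar = 1$, $E_1 = 0$, $E_2 = \hbar\omega$ with $\omega = 1$, a constant real coupling $V_{12} = V_{21} = g = 0.05$ switched on at $t = 0$, and a trapezoid rule for the time integrals. For this particular model the exact coefficients are known in closed form (the standard Rabi solution, quoted in the function `exact`), so one can see how the truncated series approaches them. The function name `next_order` is hypothetical.

```python
import cmath
from math import cos, sin, sqrt

HBAR = 1.0            # assumption: units in which hbar = 1
W, G = 1.0, 0.05      # assumed transition frequency and constant coupling

def next_order(c_prev, v, omega, times):
    """Apply Eq. 15.10 once on a time grid (cumulative trapezoid rule)."""
    size = len(v)

    def integrand(k, m):
        t = times[k]
        return sum(v[m][n] * cmath.exp(1j * omega[m][n] * t) * c_prev[k][n]
                   for n in range(size))

    result = [[0j] * size]                      # every correction vanishes at t = 0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        result.append([result[-1][m]
                       + 0.5 * dt * (integrand(k, m) + integrand(k - 1, m)) / (1j * HBAR)
                       for m in range(size)])
    return result

def exact(t):
    """Exact c_1(t), c_2(t) for the constant-coupling two-level model."""
    rabi = sqrt(W * W + 4 * G * G / HBAR**2)
    c1 = cmath.exp(-0.5j * W * t) * (cos(0.5 * rabi * t) + 1j * W / rabi * sin(0.5 * rabi * t))
    c2 = -2j * G / (HBAR * rabi) * cmath.exp(0.5j * W * t) * sin(0.5 * rabi * t)
    return [c1, c2]

energies = [0.0, HBAR * W]
omega = [[(em - en) / HBAR for en in energies] for em in energies]
v = [[0.0, G], [G, 0.0]]
times = [10.0 * k / 2000 for k in range(2001)]

c0 = [[1.0 + 0j, 0j] for _ in times]            # system starts in state 1
c1 = next_order(c0, v, omega, times)
c2 = next_order(c1, v, omega, times)

err1 = max(abs(exact(t)[m] - (c0[k][m] + c1[k][m]))
           for k, t in enumerate(times) for m in range(2))
err2 = max(abs(exact(t)[m] - (c0[k][m] + c1[k][m] + c2[k][m]))
           for k, t in enumerate(times) for m in range(2))
print("max error, first order :", err1)
print("max error, second order:", err2)
```

With these parameters the second-order sum should track the exact coefficients noticeably better than the first-order one, which is the behavior expected when the perturbation is, in the sense discussed above, small.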

If we are only interested in the effects of the first order ( linear ) in the perturbation 
potential, Eq. 15.9 contains all the general information we need. Equation 15.11 
obviously describes second order or quadratic in perturbation effects, the next 
term will yield third-order effects, and so on. In the context of the light-matter 
interaction, you can say that Eq. 15.9 describes linear optical effects, while other 
equations correspond to nonlinear phenomena of the second, third, or even higher 
orders. Nonlinear optics is a fascinating field, but it is impossible to embrace the 
unembraceable, so I will have to reluctantly limit the scope of this chapter to the
first-order corrections only.

In order to proceed any further, I need to specify the set of initial coefficients $c_m^{(0)}$
and the actual time dependence of the perturbation potential. Obviously, there exist
infinite possibilities with respect to both of these, so I will focus on the situations
which are most relevant for typical experiments. While lately experimentalists
have learned how to create complex superposition states of atoms, in most practical
situations, you will deal with an initial state represented by a simple eigenvector of
the unperturbed Hamiltonian. In this case all coefficients $c_m^{(0)} = c_m(0)$ with $m \neq m_0$,
where $|\alpha_{m_0}\rangle$ is a vector representing the initial state of the system, are equal to zero,
while $c_{m_0}^{(0)} = c_{m_0}(0) = 1$. Consequently, the summation over $n$ in Eq. 15.9 now
collapses to a single term, and one ends up with the following:

$$c_m^{(1)}(t) = \frac{1}{i\hbar}\int_0^t d\tau\, V_{mm_0}(\tau)e^{i\omega_{mm_0}\tau}. \tag{15.12}$$
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Going back to Eq. 15.3, 1 find the first-order approximation for the time-dependent 
state of the system: 


IlKO) = e~ iE ^ tlh \a mo ) + 



(15.13) 


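If the perturbation is switched off at $t = t_f$, and a measurement of energy is carried
out immediately afterward, the probability that the system is found in the stationary
state $|\alpha_m\rangle$ is given by

$$p_{mm_0} = \left|c_m^{(1)}(t_f)\right|^2 = \frac{1}{\hbar^2}\left|\int_0^{t_f} d\tau\, V_{mm_0}(\tau)\, e^{i\omega_{mm_0}\tau}\right|^2. \tag{15.14}$$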


The double-index notation of the probability is a reminder of the fact that this
probability is conditional (predicated on the system being initially in the state $|\alpha_{m_0}\rangle$)
and can be interpreted as a probability of transition between the initial and final
states. It should be noted that the requirement that the perturbation ends at $t = t_f$
should not be understood too literally. It is sufficient if the act of measurement,
taking place at this instant, disrupts the “natural” dynamics of the state prescribed
by the perturbation drastically enough that the entire evolution starts over from the
very beginning. An example of such a disruption is an act of spontaneous
emission accompanied by the transition of the atom to one of the lower energy
states, which restarts the perturbation-induced dynamics. If the final state $|\alpha_m\rangle$
belongs to a non-degenerate eigenvalue of $H_0$, it is sufficient to measure the spectral
intensity of the spontaneous emission to identify it. If, however, $|\alpha_m\rangle$ belongs to
a degenerate eigenvalue, additional characteristics such as polarization, angular
distribution, dependence on magnetic field, etc. might be required to identify the
final state uniquely. If you are only interested in the
probability of a specific energy eigenvalue, you can find it by summing $p_{mm_0}$
over all states belonging to the given degenerate energy level.

There are several important models of the time dependence of the perturbation
matrix elements $V_{mm_0}(t)$ which need to be considered. However, I will begin with
a simple toy example of a “pulse” perturbation described by the following time
dependence:

$$V_{mm_0}(t) = v_{mm_0}\,\theta(t)\, e^{-t/t_0}, \tag{15.15}$$

where $\theta(t)$ is a step function, which is equal to unity for positive values of the
argument and vanishes for negative ones, and $t_0$ is the time scale of the exponential
decay. Substituting Eq. 15.15 into Eq. 15.12 and using Eq. 15.14, I get

$$c_m^{(1)}(t_f) = -\frac{i v_{mm_0}}{\hbar}\,\frac{e^{(i\omega_{mm_0} - 1/t_0)t_f} - 1}{i\omega_{mm_0} - 1/t_0}, \qquad p_{mm_0} = \frac{|v_{mm_0}|^2 t_0^2}{\hbar^2}\,\frac{1 - 2e^{-t_f/t_0}\cos(\omega_{mm_0}t_f) + e^{-2t_f/t_0}}{1 + \omega_{mm_0}^2 t_0^2}.$$

There are several lessons to be learned from this example. First, the transition
probability decreases as the spectral distance between the initial and final states,
characterized by $|\omega_{mm_0}|$, increases. This fact can be used as a justification for limiting
the number of states included in the consideration, in particular, for the two-level
model studied in Chap. 10. Second, the number of states $\delta N$ with appreciable
transition probabilities is determined by the product of the spectral distance between
two adjacent states, $\delta\omega_m = |E_m - E_{m-1}|/\hbar = |\omega_{mm_0} - \omega_{m-1,m_0}|$, and the time scale
$t_0$ of the exponential decay of the perturbation: $\delta N \approx (\delta\omega_m t_0)^{-1}$. Indeed, as long
as $|\omega_{mm_0}| \ll 1/t_0$, the transition probabilities remain virtually independent of
$\omega_{mm_0}$, and the probability starts decreasing only as $|\omega_{mm_0}|$ exceeds $1/t_0$.
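A quick numerical check can make this example more tangible. The following Python sketch is not part of the original derivation: it evaluates the first-order amplitude of Eq. 15.12 for the pulse of Eq. 15.15 both in closed form and by direct numerical integration. The function names, the units with $\hbar = 1$, and the sample parameter values are assumptions made purely for illustration.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def amplitude_closed_form(v, omega, t0, tf):
    """First-order amplitude for V(t) = v * theta(t) * exp(-t / t0), from Eq. 15.12."""
    rate = 1j * omega - 1.0 / t0
    return -1j * v / HBAR * (cmath.exp(rate * tf) - 1.0) / rate


def amplitude_numeric(v, omega, t0, tf, steps=20000):
    """The same amplitude obtained by midpoint-rule integration of Eq. 15.12."""
    dt = tf / steps
    total = 0j
    for k in range(steps):
        tau = (k + 0.5) * dt
        total += v * math.exp(-tau / t0) * cmath.exp(1j * omega * tau)
    return -1j / HBAR * total * dt


def probability_long_time(v, omega, t0):
    """Transition probability in the limit tf -> infinity."""
    return abs(v) ** 2 * t0 ** 2 / (HBAR ** 2 * (1.0 + (omega * t0) ** 2))
```

With these definitions, `probability_long_time(v, 1 / t0, t0)` is half of `probability_long_time(v, 0.0, t0)`, which is one way to see that $1/t_0$ sets the frequency scale beyond which the transitions are suppressed.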

This result also has another important interpretation. The temporal Fourier
transform $V_{mm_0}(\omega)$ of the perturbation matrix element gives

$$V_{mm_0}(\omega) = \frac{v_{mm_0}}{\sqrt{2\pi}}\int_0^\infty e^{i\omega t}\, e^{-t/t_0}\, dt = -\frac{v_{mm_0}}{\sqrt{2\pi}}\,\frac{1}{i\omega - 1/t_0},$$

and the corresponding power spectrum $|V_{mm_0}(\omega)|^2$, which determines the relative
contributions of different frequency components to the perturbation potential, is

$$|V_{mm_0}(\omega)|^2 = \frac{|v_{mm_0}|^2}{2\pi}\,\frac{t_0^2}{1 + \omega^2 t_0^2}.$$

This is a so-called Lorentzian function, which has its maximum at $\omega = 0$ and
is reduced by a factor of 2 when $\omega = 1/t_0$. For this reason, $1/t_0$, called the
half-width at half-maximum (HWHM), can be considered as a measure of the breadth
of the spectrum of the perturbation. The condition $|\omega_{mm_0}| \lesssim 1/t_0$ then means that
the perturbation has the largest effect on those energy levels that fall within the
spectral range of the perturbation potential. Finally, one can see that in the limit
$t_f \to \infty$, the transition probability becomes proportional to the power spectrum of
the perturbation matrix elements at frequency $\omega_{mm_0}$:

$$p_{mm_0}(t_f \to \infty) = \frac{|v_{mm_0}|^2}{\hbar^2}\,\frac{t_0^2}{1 + \omega_{mm_0}^2 t_0^2} = \frac{2\pi}{\hbar^2}\left|V_{mm_0}(\omega_{mm_0})\right|^2.$$



15.2 Fermi’s Golden Rule 


Fermi’s golden rule is one of the most frequently used (and often overused) results 
of the first-order perturbation theory. It appears in several different reincarnations in 
a variety of situations and deserves an entire section devoted to it. It all begins with 
a problem of finding transition probabilities due to a monochromatic perturbation. 

15.2.1 First-Order Transition Probability in the Case
of a Monochromatic Perturbation

While a time dependence in the form of a simple trigonometric function

$$V_{mm_0}(t) = 2v_{mm_0}\,\theta(t)\cos\Omega t \tag{15.16}$$

is the simplest one, it generates transition probabilities that are not quite trivial to
interpret, and dealing with them requires certain caution. Matrix elements $v_{mm_0}$ in
Eq. 15.16 are assumed to obey the condition $v_{mm_0} = v^*_{m_0 m}$ to ensure that the entire
matrix $V_{mm_0}(t)$ is Hermitian. The choice of the cosine function to represent the time
dependence is arbitrary; the same results would follow if I chose to deal
with a sine function instead. The factor 2 is introduced for convenience and
amounts to a redefinition of the remaining matrix elements.

To compute the coefficients $c_m^{(1)}(t)$, it is convenient to rewrite the perturbation in
terms of exponential functions, which results in the following integral:

$$c_m^{(1)}(t) = -\frac{i v_{mm_0}}{\hbar}\int_0^t d\tau\left[e^{i(\Omega + \omega_{mm_0})\tau} + e^{i(-\Omega + \omega_{mm_0})\tau}\right] = -\frac{v_{mm_0}}{\hbar}\left[\frac{e^{i(\Omega + \omega_{mm_0})t} - 1}{\Omega + \omega_{mm_0}} + \frac{e^{i(-\Omega + \omega_{mm_0})t} - 1}{-\Omega + \omega_{mm_0}}\right].$$

This expression can be rewritten in terms of the so-called sinc function defined as

$$\mathrm{sinc}(x) = \frac{\sin x}{x}. \tag{15.17}$$

This name is an abbreviation of the full Latin name sinus cardinalis; it appears
quite often in the theory of signal processing, information theory, and, as you can
see, in calculations of transition probabilities with a harmonic perturbation. Its
main features are the principal maximum at $x = 0$, where it takes the value of unity, and
equidistant zeroes at $x_n = n\pi$, $n = \pm 1, \pm 2, \cdots$. The zeroes separate secondary
maxima (or minima) with ever-decreasing absolute values. Among other
important properties of this function is the following integral:

$$\int_{-\infty}^{\infty}\mathrm{sinc}^2(x)\,dx = \pi. \tag{15.18}$$
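One practical caveat when experimenting with these formulas on a computer: some numerical libraries define a normalized sinc, $\sin(\pi x)/(\pi x)$ (NumPy's `numpy.sinc` is one example), which differs from Eq. 15.17 by a rescaling of the argument. The sketch below therefore defines the unnormalized version explicitly and estimates the integral of Eq. 15.18 with a simple midpoint rule. The cutoff and the number of steps are illustrative assumptions; the truncated tails contribute roughly the inverse of the cutoff, so the estimate falls slightly short of $\pi$.

```python
import math


def sinc(x):
    """Unnormalized sinc of Eq. 15.17: sin(x) / x, with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def integrate_sinc_squared(cutoff, steps):
    """Midpoint-rule estimate of the integral of sinc^2 over [-cutoff, cutoff]."""
    dx = 2.0 * cutoff / steps
    return sum(sinc(-cutoff + (k + 0.5) * dx) ** 2 for k in range(steps)) * dx


# Truncated integral; the neglected tails are of the order of 1 / cutoff.
estimate = integrate_sinc_squared(cutoff=2000.0, steps=400000)
```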

It is also useful to take a closer look at a so-called rescaled sinc function defined as
$\mathrm{sinc}(ax)$ and considered as a function of $x$. What is interesting about this function is
that the distance between its zeroes, given by

$$x_{n+1} - x_n = \frac{\pi}{a},$$

goes to zero as the parameter $a$ tends to infinity. It means that while the value of the
function at $x = 0$ remains unity, the function becomes more and more “compressed”
around this point as $a$ increases (see Fig. 15.1, where I plotted the function $\mathrm{sinc}(ax)$
for several values of $a$).

Fig. 15.1 Function sinc(ax) for three values of parameter a

In terms of this function, the expression for the coefficients
$c_m^{(1)}(t)$ can be written down as

$$c_m^{(1)}(t) = -\frac{i v_{mm_0} t}{\hbar}\left\{\exp\left[\frac{i(\Omega + \omega_{mm_0})t}{2}\right]\mathrm{sinc}\left[\frac{(\Omega + \omega_{mm_0})t}{2}\right] + \exp\left[\frac{i(-\Omega + \omega_{mm_0})t}{2}\right]\mathrm{sinc}\left[\frac{(-\Omega + \omega_{mm_0})t}{2}\right]\right\}. \tag{15.19}$$

Properties of the sinc function offer us a valuable lesson in connection with
Eq. 15.19. Since $\Omega + \omega_{mm_0}$ and $-\Omega + \omega_{mm_0}$ never turn zero for the same
combination of $\Omega$ and $\omega_{mm_0}$, the corresponding terms in this equation are always
disparate in their magnitudes, and the disparity grows stronger with the passage
of time. It is clear from the last remark that I am treating the combination of
frequencies as the argument of the sinc function, while time $t$ plays the role of the
scaling parameter $a$.
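Both statements, the two-term sinc form of the amplitude and the disparity between its terms, can be probed numerically. The sketch below is an illustration rather than part of the derivation: it compares the sinc form with a direct midpoint-rule integration of Eq. 15.12 for the perturbation of Eq. 15.16. The names, the units with $\hbar = 1$, and the parameter values are assumptions.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def sinc(x):
    """Unnormalized sinc: sin(x) / x, with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def amplitude_sinc_form(v, omega, drive, t):
    """Two-term sinc expression with arguments (drive + omega) t / 2 and (-drive + omega) t / 2."""
    plus = 0.5 * (drive + omega) * t
    minus = 0.5 * (-drive + omega) * t
    return -1j * v * t / HBAR * (cmath.exp(1j * plus) * sinc(plus)
                                 + cmath.exp(1j * minus) * sinc(minus))


def amplitude_numeric(v, omega, drive, t, steps=20000):
    """Midpoint-rule integration of Eq. 15.12 with V(tau) = 2 v cos(drive * tau)."""
    dt = t / steps
    total = 0j
    for k in range(steps):
        tau = (k + 0.5) * dt
        total += 2.0 * v * math.cos(drive * tau) * cmath.exp(1j * omega * tau)
    return -1j / HBAR * total * dt
```

For a drive frequency close to the transition frequency, for example `omega = 3.0`, `drive = 2.9`, and `t = 40.0`, the sinc factor of the first term is bounded by the inverse of its argument, about 0.01, while that of the second term is of order unity, so the second term dominates.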

The argument of the sinc function in the first term of Eq. 15.19 turns zero when
the external frequency $\Omega$ is

$$\Omega = -\omega_{mm_0}, \tag{15.20}$$

which is only possible if $\omega_{mm_0}$ is negative because the frequency $\Omega$ is defined as a
positive quantity. Negative $\omega_{mm_0}$ corresponds to $E_m < E_{m_0}$, i.e., to transitions from
a higher energy state to lower energy states, and authors of many textbooks at
this point add that it, obviously, corresponds to a transition in which light is being
emitted. This conclusion is usually justified by appealing to the energy conservation
principle: if the energy of the atom decreases, it must go somewhere, and if
the perturbation involved in the transition is an electromagnetic wave, it is quite
natural to assume that this is where the energy goes. I would like to emphasize,
however, that the approach used to derive Eq. 15.19 is based on the assumption
that the perturbation potential is not affected by the state of the atom, so that any
processes involving emission or absorption of light, rigorously speaking, cannot be
described within this approach. Having cautioned you, I now have to admit to being
a bit facetious because as long as an interaction with the electromagnetic field is
the only interaction the atom can be engaged in, this conclusion is, indeed, true. It
can be confirmed directly by a complete theory explicitly taking into account the
absorption and emission of light by the atom.

The argument of the sinc function in the second term of Eq. 15.19 turns zero
when

$$\Omega = \omega_{mm_0}, \tag{15.21}$$

i.e., for transitions from a lower energy state to higher energy states. Using
energy conservation arguments again, you can argue that since the energy required for an
atom to make this transition can only come from the perturbation, this process must
correspond to absorption of light if the perturbation is again provided by an incident
electromagnetic wave. Regardless of which of the conditions, Eq. 15.20 or 15.21, is
fulfilled, you need to keep only one term in Eq. 15.19. Thus, limiting consideration only
to coefficients for which $\omega_{mm_0} > 0$ (and if $m_0$ is the ground state, this would include
all the coefficients, since in this case only transitions to higher states are possible),
we can simplify the expression for $c_m^{(1)}(t)$ to

$$c_m^{(1)}(t) = -\frac{i v_{mm_0} t}{\hbar}\exp\left[\frac{i(-\Omega + \omega_{mm_0})t}{2}\right]\mathrm{sinc}\left[\frac{(-\Omega + \omega_{mm_0})t}{2}\right], \tag{15.22}$$

which yields the transition probability

$$p_{mm_0} = \left|c_m^{(1)}(t)\right|^2 = \frac{|v_{mm_0}|^2 t^2}{\hbar^2}\,\mathrm{sinc}^2\left[\frac{(\Omega - \omega_{mm_0})t}{2}\right]. \tag{15.23}$$


However, the derivation of this formula is not the entire story. The real story here
is the dependence of the transition probability on time and on the detuning
$\Omega - \omega_{mm_0}$ of the external frequency. When this detuning is equal to zero (people say in
this case that the perturbation is in resonance with the transition), the probability grows
quadratically with time. This growth imposes an upper limit on how long the
perturbation can be allowed to act before the approximation used in this approach
loses its validity. Using the applicability condition $p_{mm_0} \ll 1$, I obtain for the time
limit

$$t_f \ll t^* = \frac{\hbar}{|v_{mm_0}|}. \tag{15.24}$$
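To get a feeling for how restrictive this condition is, one can compare the quadratic growth of Eq. 15.23 at zero detuning with a non-perturbative result. The sketch below uses the resonant two-level formula $\sin^2(|v_{mm_0}|t/\hbar)$, which is the standard rotating-wave result for a resonantly driven two-level system; it is quoted here as an assumption and is not derived in this section. The function names and the units with $\hbar = 1$ are likewise illustrative.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def p_first_order(v, t):
    """Eq. 15.23 at zero detuning: quadratic growth with time."""
    return (abs(v) * t / HBAR) ** 2


def p_rabi_resonant(v, t):
    """Resonant two-level probability in the rotating-wave approximation (assumed form)."""
    return math.sin(abs(v) * t / HBAR) ** 2


def relative_error(v, t):
    """Relative deviation of the first-order result from the assumed exact one."""
    exact = p_rabi_resonant(v, t)
    return abs(p_first_order(v, t) - exact) / exact
```

With `v = 0.05` the time limit is `t_star = HBAR / v = 20`. At `t = 0.1 * t_star` the two expressions agree to better than one percent, at `t = t_star` they differ by tens of percent, and at `t = 2 * t_star` the first-order "probability" exceeds unity.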


It is interesting to compare the expression for $t^*$ with Eq. 10.24 for the Rabi
frequency $\Omega_R$ of a two-level system. With the obvious changes in notation
(expressing $|v_{mm_0}|$ through the quantities used there), Eq. 15.24 can be recast as
$t_f \ll 1/\Omega_R$, allowing one to claim that the
first-order perturbation theory remains valid as long as time $t_f$ remains much smaller
than the period of the Rabi oscillations defined by the perturbation matrix element
$|v_{mm_0}|$.

Fig. 15.2 Frequency dependence of the transition probability for various times $t_f$

The quadratic rise of the resonant transition probability with time is, again, not
the whole story. The second effect occurring simultaneously is the shrinkage
with time of the interval of frequencies corresponding to non-vanishing transition
probabilities. This interval can be estimated, for instance, as the spectral distance
between the resonance $\Omega_r = \omega_{mm_0}$ and the frequency $\Omega_z$, with $\Omega_z - \omega_{mm_0} = 2\pi/t$, where
the sinc function becomes equal to zero for the first time. Beyond this point even
formally non-zero values of the probability are too small to be taken seriously (see
Fig. 15.2). With increasing $t$ this spectral interval decreases as $1/t$, so that even
though the maximum value of the probability grows as $t^2$, the area covered by the
function $p_{mm_0}(\Omega - \omega_{mm_0})$ increases only as $t$ (height × width). This area is expressed
mathematically as an integral of $p_{mm_0}(\Omega - \omega_{mm_0})$ with respect to $\Omega - \omega_{mm_0}$, and
while you are scratching your head trying to understand why it even matters to me,
let me try to explain. Consider the following integral:

$$\int_{-\infty}^{\infty}\mathrm{sinc}^2\left(\frac{ax}{2}\right)dx = \frac{2\pi}{a},$$

where I used Eq. 15.18 and the obvious substitution of variables. This result means
that

$$\frac{a}{2\pi}\int_{-\infty}^{\infty}\mathrm{sinc}^2\left(\frac{ax}{2}\right)dx = 1, \tag{15.25}$$

which indicates that the function $a\,\mathrm{sinc}^2(ax/2)/(2\pi)$ in the limit $a \to \infty$ behaves
essentially as Dirac’s delta-function: it approaches infinity as its argument goes
to zero and vanishes for all other argument values (see Fig. 15.2), and its integral is
equal to unity:

$$\lim_{a\to\infty}\frac{a}{2\pi}\,\mathrm{sinc}^2\left(\frac{ax}{2}\right) = \delta(x). \tag{15.26}$$

Identifying now $x$ with $\Omega - \omega_{mm_0}$ and $a$ with $t$, I can rewrite Eq. 15.23 in the form

$$p_{mm_0} = \frac{2\pi|v_{mm_0}|^2}{\hbar^2}\,t\,\delta(\Omega - \omega_{mm_0}),$$

which approximates the transition probability for times $t_f$ large enough to neglect
the values of the sinc function for all frequencies except the resonant one, and at the
same time small enough to still satisfy the inequality in Eq. 15.24. Sometimes people
write these results in terms of energy levels rather than transition frequencies, turning
the argument of the $\delta$-function into $\hbar\Omega - E_m + E_{m_0}$. Taking advantage of the identity
$\delta(x/a) = a\delta(x)$ (valid for $a > 0$), you can rewrite the transition probability as

$$p_{mm_0} = \frac{2\pi}{\hbar}|v_{mm_0}|^2\,t\,\delta(\hbar\Omega - E_m + E_{m_0}). \tag{15.27}$$

The linear dependence of this probability upon time allows introducing a transition
rate $R_{mm_0} = p_{mm_0}/t$, which expresses the constant (independent of time) probability
of transition per unit time:

$$R_{mm_0} = \frac{2\pi}{\hbar}|v_{mm_0}|^2\,\delta(\hbar\Omega - E_m + E_{m_0}). \tag{15.28}$$
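The delta-function limit of Eq. 15.26, on which Eqs. 15.27 and 15.28 rest, is easiest to appreciate by letting the function $a\,\mathrm{sinc}^2(ax/2)/(2\pi)$ act on a smooth test function inside an integral. The sketch below does this numerically for a Lorentzian test function. The test function, the integration window, and the step count are arbitrary choices made for illustration only.

```python
import math


def sinc(x):
    """Unnormalized sinc: sin(x) / x, with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def kernel(a, x):
    """The function (a / 2 pi) * sinc^2(a x / 2) appearing in Eq. 15.26."""
    return a / (2.0 * math.pi) * sinc(0.5 * a * x) ** 2


def lorentzian(x):
    """Smooth test function with value 1 at x = 0."""
    return 1.0 / (1.0 + x * x)


def smeared(g, a, half_width=40.0, steps=200000):
    """Midpoint-rule integral of kernel(a, x) * g(x) over [-half_width, half_width]."""
    dx = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * dx
        total += kernel(a, x) * g(x)
    return total * dx
```

Calling `smeared(lorentzian, a)` for `a = 5`, `50`, and `200` gives values that increase toward `lorentzian(0) = 1`, which is exactly what a delta-function inside the integral would return.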

Equation 15.28 is one of the several expressions for what is known in quantum
mechanics as Fermi’s golden rule. The problem with this result is that despite
its fame and years of successful use, it is not that easy to make sense of it. Indeed,
according to Eq. 15.28, the transitions are possible only if the argument of the
delta-function is exactly equal to zero, a condition which is experimentally impossible
to achieve. Even from the theoretical viewpoint, this expression is dubious because
the delta-function as a mathematical construct only makes sense inside an integral
with another, regular function. You might try to save the situation by assuming
that the whole business of replacing the sinc function with the delta-function was
wrongheaded, and if we had just stayed with the sinc function, all problems could have
been avoided. Unfortunately, this idea does not save the situation for several reasons.
First, with increasing time, the width of the main maximum of the sinc function
objectively grows too small to have experimental relevance, and the use of the
delta-function simply emphasizes this real fact (experimentally, starting with some
time $t$, the difference between the zero width of the delta-function and the finite but
tiny width of the sinc function becomes unnoticeable). Second, avoiding the transition
to the delta-function, you would not be able to reveal the important feature of
the time dependence of the transition probability, namely, its linear rather than
quadratic dependence on time, understood in some integrated sense. The transition
from the quadratic time dependence at small $t$ to the linear dependence as $t$ grows is not a
regular crossover from one asymptotic form of a function to another. It is actually
a transition in interpreting the transition probability from a regular function of
frequency to what mathematicians call a distribution, a mathematical entity which
only makes sense when used inside an integral. The transition from the sinc
to the delta-function simply illuminates this quite real and experimentally relevant
change in the nature of the frequency dependence of the transition probability.

Thus, the only way to make sense of Eq. 15.28 is to find what this transition
rate can be integrated with. It is quite clear that within the current picture of a
monochromatic perturbation acting on an atom with a discrete spectrum, this goal
cannot be achieved, which means that the perturbation theory is not really suitable
for dealing with such systems. Indeed, you may remember that in the extreme model
of the atom with only two energy levels, solved in Sect. 10.2 without reliance on the
perturbation theory, we did not encounter any of the problems plaguing us here.
But do not worry, you have not just wasted an hour of your life reading the lead-up
to Eq. 15.28. This result is still useful, but you need to learn when and how to use it,
and this is what I am going to show you in the next subsections.


15.2.2 A Non-monochromatic Noncoherent Perturbation 

To justify starting the discussion of the transition probabilities with the simple
monochromatic (described by sin or cos functions) perturbation, I could remind
you that an arbitrary time-dependent perturbation can be expanded into a linear
combination of monochromatic functions. This expansion is well known as a Fourier
series, if the initial function is periodic, or a Fourier integral, if it is not. I will not
assume periodicity of the perturbation potential and will present it as

$$V_{mm_0}(t) = v_{mm_0}\int_{-\infty}^{\infty}F(\Omega)\,e^{-i\Omega t}\,d\Omega, \tag{15.29}$$

where $F(\Omega)$ is the Fourier transform of the function $f(t)$ prescribing the time
dependence of the perturbation potential.⁴ Since the integration with respect to time
required by Eq. 15.12 can be carried out independently of the integration over $\Omega$,
I can simply apply Eq. 15.22 to each frequency component of Eq. 15.29 to get the result
for the first-order transition amplitude:

$$c_m^{(1)}(t) = -\frac{i v_{mm_0} t}{\hbar}\int_{-\infty}^{\infty}d\Omega\,F(\Omega)\exp\left[\frac{i(-\Omega + \omega_{mm_0})t}{2}\right]\mathrm{sinc}\left[\frac{(-\Omega + \omega_{mm_0})t}{2}\right]. \tag{15.30}$$

⁴ Equation 15.29 obviously implies that the time dependence of the perturbation potential appears as a
product of a time-dependent function and a time-independent operator, which is the case in many
practically important situations.

Note that since Eq. 15.29 does not formally contain the $\exp(i\Omega t)$ terms which appeared
in Eq. 15.16 in order to maintain the Hermitian nature of the perturbation, I can
use Eq. 15.22, omitting the second, nominally emission-related term appearing in
the treatment of the monochromatic perturbation. Because the integration over $\Omega$
includes both positive and negative values, this term is automatically included in
Eq. 15.30. The hermiticity of the perturbation is now ensured by the condition
$F(\Omega) = F^*(-\Omega)$, which guarantees that $f(t)$ is a real-valued function. This
condition is a known result from the theory of Fourier transforms, but you can easily
verify it yourself.

The corresponding expression for the transition probability

$$p_{mm_0} = \frac{|v_{mm_0}|^2 t^2}{\hbar^2}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}d\Omega_1\,d\Omega_2\,F(\Omega_1)F^*(\Omega_2)\exp\left[-\frac{i(\Omega_1 - \Omega_2)t}{2}\right]\mathrm{sinc}\left[\frac{(\Omega_1 - \omega_{mm_0})t}{2}\right]\mathrm{sinc}\left[\frac{(\Omega_2 - \omega_{mm_0})t}{2}\right] \tag{15.31}$$

features the integration of the sinc function over the external frequency, which I
sought to find, but I got more than I bargained for: two integrals instead of
one and two external frequencies (I had to introduce two separate integration
variables to turn a product of two integrals into a double integral). To move forward
with this expression, I have to make additional assumptions about the perturbation,
and I will begin by presenting $F(\Omega)$ in the amplitude-phase form

$$F(\Omega) = |F(\Omega)|\,e^{i\varphi(\Omega)},$$

where I introduced its absolute value $|F(\Omega)|$ and phase $\varphi(\Omega)$. You can think of
Eq. 15.29 as a superposition of monochromatic waves each with its own phase, and
as your experience with wave superposition might tell you, the result depends a lot
on the relation between the phases of the waves being added. In the simplest case,
we distinguish between two extremes: the coherent and incoherent superposition.
In the former case, the relation between the phases is fixed and the intensity
of the resultant wave produces a characteristic interference pattern. Incoherent
superposition occurs when the phases of the individual waves are not fixed and
change randomly so that all interference terms dependent on the phase difference
average out to zero. The resulting intensity becomes merely the sum of intensities
of the individual waves with all traces of the interference pattern washed out.
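The difference between the two extremes can be illustrated with a small Monte Carlo experiment on a discrete set of waves. The sketch below is a toy model, not the continuous superposition of Eq. 15.29: the four amplitudes, the number of trials, and the random seed are arbitrary assumptions. It averages the intensity of a sum of phasors over independent, uniformly distributed phases and compares it with the sum of individual intensities.

```python
import cmath
import math
import random


def average_intensity(amplitudes, trials=20000, seed=1):
    """Average |sum_k A_k exp(i phi_k)|^2 over independent uniform random phases."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        field = sum(a * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
                    for a in amplitudes)
        total += abs(field) ** 2
    return total / trials


amplitudes = [1.0, 0.5, 2.0, 1.5]
incoherent = sum(a * a for a in amplitudes)  # sum of the individual intensities
coherent = sum(amplitudes) ** 2              # all phases equal: full interference
```

The phase-averaged intensity returned by `average_intensity(amplitudes)` settles near `incoherent`, well below the `coherent` value obtained when all phases coincide.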

With this in mind, I am going to suppose that Eq. 15.29 represents an incoherent
superposition in which the phase difference $\varphi(\Omega_1) - \varphi(\Omega_2)$ is a random quantity
fluctuating between values of 0 and $2\pi$. This quantity appears in Eq. 15.31 in
the form $\exp\{i[\varphi(\Omega_1) - \varphi(\Omega_2)]\}$, which vanishes upon “averaging” over the phases
unless $\Omega_1 = \Omega_2$. If we dealt with a sum over discrete frequencies, I would
have assigned the value of unity to this expression at $\Omega_1 = \Omega_2$, but since I am
dealing with an integral over a continuously varying quantity, it is more appropriate
to prescribe that

$$\overline{\exp\{i[\varphi(\Omega_1) - \varphi(\Omega_2)]\}} = \delta(\Omega_1 - \Omega_2), \tag{15.32}$$

where the line above the exponential signifies “averaging” over the phases.
Substituting this expression into Eq. 15.31, you end up with the following:

$$p_{mm_0} = \frac{|v_{mm_0}|^2}{\hbar^2}\,t^2\int_{-\infty}^{\infty}d\Omega\,|F(\Omega)|^2\,\mathrm{sinc}^2\left[\frac{(\Omega - \omega_{mm_0})t}{2}\right]. \tag{15.33}$$

I will not be surprised if you are not completely satisfied with my “derivation” of
the last expression and think of it as a bit of hand-waving. I would agree,
as this can hardly be called a derivation at all: the nature of the averaging procedure
was not explained, and Eq. 15.32 was essentially postulated, not derived. However,
the result, Eq. 15.33, looks quite reasonable and in sync with our intuition about
incoherent superpositions. Indeed, the story which this expression tells makes total
sense: each frequency component of the perturbation potential generates a transition
probability as given by Eq. 15.23 weighted by the spectral intensity $|F(\Omega)|^2$. The
total probability due to all frequency components is just a sum (well, in this case, it is
an integral) of the probabilities caused by each spectral component independently.
The incoherence assumption is in this context equivalent to the assumption that
each spectral component of the perturbation induces transitions independently of
the others. Of course, this result can be derived and formulated in a more rigorous
manner, but I hope you will forgive me for the decision to spare you the technical
details. Those who are still not completely satisfied can google the Wiener-Khinchin
theorem and enjoy the ride. In the meantime, I will replace the sinc function by a
delta-function following the rule established above, transforming Eq. 15.33 into

$$p_{mm_0} = \frac{2\pi|v_{mm_0}|^2}{\hbar^2}\,t\int_{-\infty}^{\infty}d\Omega\,|F(\Omega)|^2\,\delta(\Omega - \omega_{mm_0}) = \frac{2\pi|v_{mm_0}|^2}{\hbar^2}\left|F(|\omega_{mm_0}|)\right|^2 t. \tag{15.34}$$

Equation 15.34 and the corresponding formula for the transition rate

$$R_{mm_0} = \frac{2\pi|v_{mm_0}|^2}{\hbar^2}\left|F(|\omega_{mm_0}|)\right|^2, \tag{15.35}$$

unlike Eq. 15.28, do make total sense and can be put to immediate use.
Equation 15.35 can be considered as Fermi’s golden rule formulated for an incoherent
perturbation, where the transition rate is determined by the spectral intensity of the
perturbation at the transition frequency. You should also notice that these equations
describe both absorption and emission transitions, as exemplified by my use of
$|\omega_{mm_0}|$. Indeed, if $\omega_{mm_0} > 0$ (absorption), the zero of the argument of the
delta-function in Eq. 15.34 occurs at positive values of $\Omega$, while if $\omega_{mm_0} < 0$, it
happens at negative $\Omega$, which are also present in the respective integral. However,
the property $F(\Omega) = F^*(-\Omega)$ required to keep the perturbation Hermitian also yields
$F(-\Omega)F^*(-\Omega) = F^*(\Omega)F(\Omega)$, ensuring that the spectral density is an even
function of the frequency: $|F(-\Omega)|^2 = |F(\Omega)|^2$. Therefore, the use of the absolute
value symbol in its argument is not really necessary, as transition rates for emission
and absorption between the same pairs of states are obviously equal to each other.
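The passage from the sinc-function integral of Eq. 15.33 to a constant transition rate can also be checked numerically. The sketch below assumes a hypothetical Gaussian spectral density $|F(\Omega)|^2$, chosen only because it is smooth and even in frequency, and units with $\hbar = 1$. It compares the probability of Eq. 15.33 divided by time with the rate obtained by evaluating the spectral density at the transition frequency. All names and parameter values are illustrative assumptions.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def sinc(x):
    """Unnormalized sinc: sin(x) / x, with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def spectral_density(freq, width=1.0):
    """Hypothetical even spectral density |F|^2: a Gaussian of the given width."""
    return math.exp(-(freq / width) ** 2)


def rate_golden_rule(v, omega):
    """Rate 2 pi |v|^2 |F(|omega|)|^2 / hbar^2 for the assumed spectral density."""
    return 2.0 * math.pi * abs(v) ** 2 / HBAR ** 2 * spectral_density(abs(omega))


def rate_from_sinc(v, omega, t, span=30.0, steps=200000):
    """p / t from Eq. 15.33, integrating over detunings in [-span, span] by the midpoint rule."""
    d = 2.0 * span / steps
    total = 0.0
    for k in range(steps):
        detuning = -span + (k + 0.5) * d
        total += spectral_density(omega + detuning) * sinc(0.5 * detuning * t) ** 2
    return abs(v) ** 2 * t / HBAR ** 2 * total * d
```

For a time much longer than the inverse spectral width, for example `t = 100.0` with `omega = 0.8`, the two rates agree to within a few percent, and `rate_golden_rule` returns the same value for `omega` and `-omega`, reflecting the equality of absorption and emission rates.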


15.2.3 Transitions to a Continuum Spectrum 

Fermi’s golden rule in its delta-functional form can also be used in a meaningful
way when either the final or the initial state (or both) participating in the transition
belongs to a continuous spectrum. Such a situation arises, for instance, in the case of
ionizing transitions, when an electron is kicked out from its initial discrete energy
level to the spectral continuum, where its wave function is no longer localized.
We call this process ionization because the electron in such a final state has a
non-zero probability to get infinitely far away from the remaining ion, so it does
not “belong” to it anymore. Optical (caused by interaction with light) transitions
in semiconductors present an example in which both the initial and final states of the
transition belong to a continuous spectrum. In this section I will consider the
transitions between discrete and continuous states, leaving the semiconductor case
for a special treatment in the section on light absorption.

When dealing with transitions to the states of the continuous spectrum, the first 
thing you need to realize is that a probability of transition to a state with an exact 
energy eigenvalue does not make sense anymore, and you need to discuss this 
situation in terms of probability density. The actual transition probability can only 
be then defined as an integral of the probability density over some finite energy 
interval. This is nothing new, of course, as we have already discussed the conversion 
of probabilities to probability densities in Sects. 2.3 and 3.3.5 when exploring 
differences between observables with discrete and continuous eigenvalues. 

So, my mission now is to turn the transition probability $p_{mm_0}$ into an appropriate
probability density, and to accomplish this, I will rely upon the old trick of forced
discretization, which I used in Sect. 11.5 on a system of non-interacting electrons.
However, the concrete realization of the discretization procedure depends on
how the states of the continuous spectrum are described. I will consider two different
ways to deal with them. If, as is often the case, the unperturbed Hamiltonian $\hat{H}_0$
commutes with the angular momentum operators $\hat{L}^2$ and $\hat{L}_z$, its eigenvectors
can be characterized by azimuthal and magnetic quantum numbers $L$ and $M$ even
if the eigenvector belongs to a continuous spectrum. These states, then, are still
characterized by three discrete indexes (including spin) and a single continuously
changing parameter $k = \sqrt{2m_eE}/\hbar$. This parameter appears in the radial equation
for the electron in a central potential in the form of the dimensionless coordinate $\rho = kr$
in Eq. 8.12, Chap. 8. You can easily demonstrate it by replacing the dimensionless
coordinate and the dimensionless energy $\epsilon_{n,l}$ appearing in that equation by their
expressions in terms of the real physical coordinate $r$ and energy $E_{n,l}$. In order to
discretize this parameter, I have to impose a boundary condition on the radial function
$R_l(kr)$: $R_l(kD) = 0$, where $D \to \infty$. Sparing you the details of the derivation, it can be
shown that this boundary condition results in the following allowed values of $k_j$:
$k_j = \pi j/D + \delta_l/D$, where $\delta_l$ is an extra phase dependent on the azimuthal quantum
number $l$, which you do not have to worry about, and $j = 1, 2, 3, \cdots$. Now the expression
for the transition probability $p_{m_0m}$ can be rewritten as

$$p_{m_0;m',k_j} = \frac{2\pi|v_{m_0;m',k_j}|^2}{\hbar}\,t\,\delta(\hbar\Omega - E_j + E_{m_0}),$$

where the index $m_0$ includes principal, azimuthal, magnetic, and spin quantum numbers
of the initial state, while the principal number in $m'$ is replaced by a wave number
$k_j$. To turn this probability into the probability density, I will perform the following
standard series of steps:

$$\begin{aligned}
\Delta p_{m_0;m',k_j} &= \frac{2\pi|v_{m_0;m',k_j}|^2}{\hbar}\,t\,\delta(\hbar\Omega - E_j + E_{m_0})\,\Delta j \\
&= \frac{2\pi|v_{m_0;m'}(k)|^2}{\hbar}\,t\,\delta(\hbar\Omega - E + E_{m_0})\,\frac{D}{\pi}\,\Delta k \\
&= \frac{2D\,|v_{m_0;m'}(E)|^2}{\hbar}\sqrt{\frac{m_e}{2\hbar^2E}}\;t\,\delta(\hbar\Omega - E + E_{m_0})\,dE.
\end{aligned}$$

The first line in this expression is obtained by multiplying the initial transition
probability by $\Delta j$, which is equal to unity, so this step does not really change
anything. In the second line, I used the quantization rule obtained from the boundary
conditions (note that the phase $\delta_l$, which I asked you not to worry about, disappeared)
to replace $\Delta j$ with $\Delta k$ and upgraded $k$ from a discrete lower index to the continuous
argument of the matrix element. Finally, I converted $\Delta k$ into $\Delta E$ using the relation
$k = \sqrt{2m_eE}/\hbar$ to compute the derivative $dk/dE = \sqrt{m_e/(2E\hbar^2)}$. The entire
expression in front of $dE$ is the probability density, and the actual probability to
undergo a transition to a state with any energy within a certain interval would
normally be given by an integral over this interval. However, the presence of the
delta-function in the probability density makes the transition possible only to the states with

$$E_{tr} = \hbar\Omega + E_{m_0},$$

with the corresponding probability given by the integral over an arbitrary energy
interval as long as it includes $E_{tr}$. Thus, the rate of transitions from the initial state
$|m_0\rangle$ to the final state $|m', E_{tr}\rangle$ is equal to

$$R_{m_0;m'} = \frac{2D}{\hbar^2}\sqrt{\frac{m_e}{2E_{tr}}}\,|v_{m_0;m'}(E_{tr})|^2. \tag{15.36}$$
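The key ingredient of this conversion, replacing a count of discrete states by $(D/\pi)(dk/dE)\,dE$, can be verified by brute force. The sketch below counts the allowed wave numbers $k_j = (\pi j + \delta_l)/D$ whose energies fall into a given interval and compares the count with the density-of-states estimate. The units with $\hbar = m_e = 1$, the value of the phase, and the box size are assumptions made for illustration.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1
M_E = 1.0   # assumption: units in which the electron mass is 1


def count_states(e_low, e_high, box, phase=0.3):
    """Count k_j = (pi * j + phase) / box, j = 1, 2, ..., with e_low <= E < e_high."""
    count = 0
    j = 1
    while True:
        k = (math.pi * j + phase) / box
        energy = (HBAR * k) ** 2 / (2.0 * M_E)
        if energy >= e_high:
            return count
        if energy >= e_low:
            count += 1
        j += 1


def density_of_states(energy, box):
    """dj/dE = (D / pi) * dk/dE with dk/dE = sqrt(m_e / (2 hbar^2 E))."""
    return box / math.pi * math.sqrt(M_E / (2.0 * HBAR ** 2 * energy))
```

For a large box, for example `box = 5000.0` and the interval from 1.0 to 1.1, the direct count and `density_of_states(1.05, box) * 0.1` differ by less than one state, and the result does not depend noticeably on the value of `phase`.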


If the perturbation potential does not depend on the spin of the electron, the
perturbation matrix element is diagonal with respect to the spin quantum numbers
of the initial and final states. If, in addition, you do not really care about the spin state of
your electron, the transition rate needs to be summed over the spin states, which,
in this case, simply produces an extra factor of two in the transition rate. Then the
spin-indifferent transition rate becomes

$$R_{m_0;m'}^{\text{(spin-summed)}} = 2R_{m_0;m'} = \frac{4D}{\hbar^2}\sqrt{\frac{m_e}{2E_{tr}}}\,|v_{m_0;m'}(E_{tr})|^2,$$

where the indexes $m'$ and $m_0$ are now missing the spin quantum numbers.

In the approach leading to Eq. 15.36, the wave functions of the position rep¬ 
resentation were defined using the spherical coordinates. While this approach is 
useful in some circumstances, the wave functions even of a free particle in the 
spherical coordinates have a pretty complicated form. Thus, in many practically 
relevant situations, it is more convenient to use Cartesian coordinates instead. In 
this case the states of the continuous spectrum can be characterized by a regular 
wave vector k with the energy E (i k ) defined in a standard free particle-like way. If 
we are dealing with an electron moving in a Coulomb potential, like in a hydrogen 
atom problem, the respective wave functions (remember, I am talking about the 
continuous spectrum here, and we did not deal with it in Chap. 8) are called 
Coulomb wave functions, and they are too mathematically complex to be bothered 
with here. These functions look like modified plane waves and asymptotically 
approach the latter as the energy and/or the distance from the nucleus grows. And 
luckily enough, large distance asymptotic behavior is all I need in order to introduce 
a quasi-continuous discrete description of these functions. To this end I can go back 
to the same periodic boundary conditions used in Sect. 11.5 and introduce quantized 
components of the wave vector as k mi 2 3 = Ijrmixi/L with the set of numbers mi 2,3 
comprising the composite index m in the expressions for the probability distribution. 
Then I can convert the transition probability into a transition probability distribution 
using the same trick as before by introducing 



1 = Ami Am2 Am3 = 
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into 


Ap mom = p mm AmiAm 2 Am 3 = -—-j p mo , s ( k ) Ak x Ak y Ak z , (15.37) 

(2n) 

where in the last expression I replaced the discrete index $m$ with the set of
quasi-continuous variables $k_i$ while keeping the spin quantum number $s$ (the only
remaining discrete characteristic of the final state) as an index. Incorporating
Eq. 15.27, I can transform it into the following expression for the transition
probability density:


$$dp_{m_0,s}(k_1,k_2,k_3) = \frac{2\pi t}{\hbar}\,\frac{L^3}{(2\pi)^3}\,\left|v_{m_0,s}(\mathbf{k})\right|^2 \delta\left(\hbar\Omega - E_{\mathbf{k},s} + E_{m_0}\right) d^3k. \tag{15.38}$$
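The replacement of the discrete index by $L^3/(2\pi)^3\,d^3k$ can be checked by brute force. The following Python sketch is not part of the original derivation; the function names and the values of `L` and `k_max` are illustrative assumptions. It counts the wave vectors allowed by the periodic boundary conditions inside a sphere of radius `k_max` and compares the count with $L^3/(2\pi)^3$ times the volume of that sphere.

```python
import math

def count_states(L, k_max):
    """Count integer triples (m1, m2, m3) with |k| <= k_max, where k_i = 2*pi*m_i/L."""
    radius = k_max * L / (2.0 * math.pi)   # radius of the sphere in units of m_i
    m_max = int(radius) + 1                # safe upper bound on |m_i|
    count = 0
    for m1 in range(-m_max, m_max + 1):
        for m2 in range(-m_max, m_max + 1):
            for m3 in range(-m_max, m_max + 1):
                if m1 * m1 + m2 * m2 + m3 * m3 <= radius * radius:
                    count += 1
    return count

def continuum_estimate(L, k_max):
    """L^3/(2*pi)^3 times the volume of a sphere of radius k_max in k-space."""
    return (L / (2.0 * math.pi)) ** 3 * (4.0 / 3.0) * math.pi * k_max ** 3

if __name__ == "__main__":
    # Illustrative values (assumed): k_max * L / (2*pi) = 20.5
    L, k_max = 41.0 * math.pi, 1.0
    print(count_states(L, k_max), continuum_estimate(L, k_max))
```

The two numbers agree better and better as $L$ grows, which is the sense in which the discrete sum over $m$ becomes an integral over $\mathbf{k}$.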

The delta-function in this expression limits the non-zero values of the probability
density to what is called the “equienergetic” surface $E_{\mathbf{k},s} = \hbar\Omega + E_{m_0}$. A further
development of this expression depends on the shape of this surface. I will only
consider here the simplest and, therefore, the most popular case of spherical
equienergetic surfaces, which correspond to the free-particle relation between
energy and the wave vector:

$$E_{\mathbf{k}} = \frac{\hbar^2 k^2}{2m_e}.$$

In this case, it is convenient to replace the Cartesian components of the wave
vector $k_x$, $k_y$, and $k_z$ with their spherical counterparts $k$, $\theta$, and $\varphi$, where
$k = \sqrt{k_x^2 + k_y^2 + k_z^2}$ and the angles $\theta$ and $\varphi$ define the direction of the wave vector with
respect to some chosen coordinate axes. Transforming the Cartesian volume element
$dk_x\,dk_y\,dk_z$ (i.e., the volume in the space of wave vectors) into the spherical one

$$dk_x\,dk_y\,dk_z = k^2 \sin\theta\, d\theta\, d\varphi\, dk,$$


I can rewrite Eq. 15.38 as 


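$$dp_{m_0,s} = \frac{2\pi t}{\hbar}\,\frac{L^3}{(2\pi)^3}\left|v_{m_0,s}(k,\theta,\varphi)\right|^2 \delta\left(\hbar\Omega - E_{k,s} + E_{m_0}\right) k^2 \sin\theta\, d\varphi\, d\theta\, dk.$$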


Using the relation between $k$ and the energy $E$, it is more convenient to rewrite the
latest expression in terms of the probability density for the latter. Assuming that the energy
of the particle in the continuous spectrum does not depend on spin and using
$k^2 = 2m_eE/\hbar^2$ and $dk = \sqrt{m_e/(2\hbar^2E)}\,dE$, so that $k^2\,dk = \sqrt{2m_e^3E}\,dE/\hbar^3$, you can obtain

$$dp_{m_0,s} = \frac{2\pi t}{\hbar}\,\frac{L^3}{(2\pi)^3}\,\frac{\sqrt{2m_e^3}}{\hbar^3}\,\sqrt{E}\,\left|v_{m_0,s}(k,\theta,\varphi)\right|^2 \delta\left(\hbar\Omega - E + E_{m_0}\right)\sin\theta\, d\varphi\, d\theta\, dE.$$


The rest of the derivation follows almost the same line of argument as the one presented
when deriving Eq. 15.36, with one exception. In Eq. 15.36 there was a single
continuously distributed variable ($E$, to be sure), so once I integrated it out, the
remaining expression became the actual probability for the remaining discrete
characteristics of the final state $m'$. Here $E$ is just one of the three continuous
parameters defining the final state, and while the delta-function again takes care
of $E$, making its value definite and equal to $E_{tr}$, after integrating it out you are still
left with a differential probability $dp_{m_0,s}(E_{tr})$ defining the probability for the
electron’s after-transition wave vector to lie within an infinitesimally small solid
angle $d\Omega = \sin\theta\, d\varphi\, d\theta$ around the direction specified by the angles $\theta$ and $\varphi$. Now,
defining the corresponding transition rate density $R_{m_0,s}(\theta,\varphi)$ as the probability of
transition per unit time per unit solid angle


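$$R_{m_0,s}(\theta,\varphi) = \frac{dp_{m_0,s}(E_{tr})}{t\,d\Omega},$$

I find for it

$$R_{m_0,s}(\theta,\varphi) = \frac{2\pi}{\hbar}\,\frac{L^3}{(2\pi)^3}\,\frac{\sqrt{2m_e^3}}{\hbar^3}\,\sqrt{E_{tr}}\,\left|v_{m_0,s}(E_{tr},\theta,\varphi)\right|^2. \tag{15.39}$$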

This result describes the spin and angular distribution of electrons kicked out of
an atom as a result of ionization for various orientations of the electron spin.
Experimentally, this quantity can be measured by placing spin-sensitive electron
detectors at different angles and counting the number of electrons ejected in different
directions per unit time by an ensemble of non-interacting atoms under the
action of a perturbation. If the presence of the quantization volume $L^3$ (which is a pretty
arbitrary quantity) in this expression gives you the jitters, fear not. It will be canceled
by the $L^{-3/2}$ normalization factor in the wave function of the continuous spectrum
when you compute the perturbation matrix element $v_{m_0,s}(E_{tr},\theta,\varphi)$.

Now, if you are not interested in the direction of propagation of the electron
after ionization or in its spin state, and only want to know the total transition rate $R_{m_0}$
from the initial state $|\alpha_{m_0}\rangle$ (you can call it the rate of ionization), you will integrate
$R_{m_0,s}(\theta,\varphi)$ over the entire solid angle and sum it over the spin states:

$$R_{m_0} = \frac{2\pi}{\hbar}\,\frac{L^3}{(2\pi)^3}\,\frac{\sqrt{2m_e^3E_{tr}}}{\hbar^3}\int_0^{\pi} d\theta\,\sin\theta\int_0^{2\pi} d\varphi\left(\left|v_{m_0,\uparrow}(E_{tr},\theta,\varphi)\right|^2 + \left|v_{m_0,\downarrow}(E_{tr},\theta,\varphi)\right|^2\right).$$

If the perturbation matrix element does not depend on the angles and the spin, then
the integration over angles combined with the summation over spin yields the trivial
factor $8\pi$, so that the expression for the ionization rate can be written down as

$$R_{m_0} = \frac{2\pi}{\hbar}\left|v_{m_0}(E_{tr})\right|^2 g\left(E_{m_0} + \hbar\Omega\right), \tag{15.40}$$

where $g(E)$ is the same density of states for a free electron, which was defined in
Eq. 11.56. Equation 15.40 is often passed off as one of the incarnations of Fermi’s
golden rule, but one would be wise to remember the limitations involved in its
derivation: angle and spin independence of the perturbation matrix element. I will
return to this issue in the section on light absorption and emission.

I will complete this section with an example of a calculation based on Eq. 15.39.

Example 31 (Ionization of a Hydrogen Atom) Find the angular distribution of the
electrons ejected from the ground state of a hydrogen atom by a uniform oscillating
electric field $\boldsymbol{\mathcal{E}} = \boldsymbol{\mathcal{E}}_0\cos\Omega t$, assuming that the frequency $\Omega$ satisfies the condition
$\hbar\Omega \gg 13.6$ eV. This condition ensures that the final state of the electron indeed
belongs to the continuous spectrum and that its energy is high enough to approximate
the respective wave function by a plane wave.

Solution 

The perturbation operator corresponding to the uniform electric field can be written
down as

$$\hat{V} = e\,\mathbf{r}\cdot\boldsymbol{\mathcal{E}}(t) = e\,\mathbf{r}\cdot\boldsymbol{\mathcal{E}}_0\cos\Omega t, \tag{15.41}$$

where $e$ is the absolute value of the electron’s charge. A comparison with Eq. 15.16
allows me to identify the matrix elements $v_{m_0m}$ as



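$$v_{m_0,s}(\mathbf{k}) = \frac{e}{2}\,\frac{1}{\sqrt{\pi a_B^3L^3}}\int d^3r\; e^{-r/a_B}\left(\mathbf{r}\cdot\boldsymbol{\mathcal{E}}_0\right)e^{i\mathbf{k}\cdot\mathbf{r}}, \tag{15.42}$$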


where I substituted the wave function of the ground state of the hydrogen atom from 
Chap. 8, which represents the initial state, and the box normalized plane wave for 
the final state from Sect. 11.5. 

The structure of the hydrogen ground state wave function, which depends only on
the absolute value of the position vector $r = |\mathbf{r}|$, indicates that you can save time and
effort by using a spherical coordinate system to calculate this integral. However, in
order to take advantage of this realization, you first have to choose the directions of
the axes that you will use to define the spherical coordinates. Actually, there is only
one axis which you need to worry about: the polar ($Z$) axis. When making such a
decision, it is wise to consider first if there are any constant vectors in the integrand
(the position vector $\mathbf{r}$ does not count, since it is an integration variable). In Eq. 15.42 there
are two such vectors: the wave vector of the final state $\mathbf{k}$ and the external electric
field $\boldsymbol{\mathcal{E}}_0$. These two vectors define a plane, and it is quite obvious that choosing it as
one of the coordinate planes will simplify the task (both vectors will immediately
lose one of their coordinates, and two coordinates to worry about are always better
than three, right?). Now, the last question is: which of these two vectors should you choose to
define the direction of the polar axis? My bet would be on $\mathbf{k}$ since this choice, while
making $\mathbf{r}\cdot\boldsymbol{\mathcal{E}}_0$ more complex, would simplify the expression $\mathbf{k}\cdot\mathbf{r}$, and it is better to
have a more complex expression as a factor than as an argument of the exponential
function. Thus, settling on the direction of the $Z$-axis along $\mathbf{k}$, and choosing $\boldsymbol{\mathcal{E}}_0$ to
belong to the $X$-$Z$ plane, I can immediately express $\mathbf{k}\cdot\mathbf{r}$ and $\mathbf{r}\cdot\boldsymbol{\mathcal{E}}_0$ in the corresponding
spherical coordinates $r$, $\theta$, and $\varphi$. With this choice of coordinate axes, I have

$$\mathbf{k}\cdot\mathbf{r} = kz = kr\cos\theta$$

and

$$\mathbf{r}\cdot\boldsymbol{\mathcal{E}}_0 = x\,\mathcal{E}_0\sin\theta_E + z\,\mathcal{E}_0\cos\theta_E = r\,\mathcal{E}_0\left(\sin\theta\cos\varphi\sin\theta_E + \cos\theta\cos\theta_E\right),$$

where $\theta_E$ is the angle between the vector of the electric field and the wave vector $\mathbf{k}$
(see Fig. 15.3). Finally, introducing the volume element in spherical coordinates,
I have the following for the matrix element:


$$v_{m_0,s}(\mathbf{k}) = \frac{e\mathcal{E}_0}{2}\,\frac{1}{\sqrt{\pi a_B^3L^3}}\int_0^{\infty} dr\,r^2\int_0^{\pi} d\theta\,\sin\theta\int_0^{2\pi} d\varphi\; e^{-r/a_B}\,r\left(\sin\theta\cos\varphi\sin\theta_E + \cos\theta\cos\theta_E\right)e^{ikr\cos\theta}. \tag{15.43}$$

Fig. 15.3 The choice of the coordinate system for calculations of the integral in Eq. 15.42

First, note that the first term in the integral contains the factor $\cos\varphi$, which after
integration with respect to the azimuthal angle $\varphi$ yields zero: $\int_0^{2\pi} d\varphi\cos\varphi = 0$.
The remaining integral

$$v_{m_0,s}(\theta_E) = \frac{e\mathcal{E}_0\cos\theta_E}{2\sqrt{\pi a_B^3L^3}}\int_0^{\infty} dr\,r^3e^{-r/a_B}\int_0^{\pi} d\theta\,\sin\theta\cos\theta\,e^{ikr\cos\theta}\int_0^{2\pi} d\varphi,$$

where I replaced the argument $\mathbf{k}$ with $\theta_E$, reflecting that this is the only characteristic of
the direction of $\mathbf{k}$ which this matrix element depends upon, takes a bit more work. The integration
with respect to $\varphi$ is, of course, trivial and produces $2\pi$, while the integral with
respect to the polar angle requires the standard substitution of the variable $s = \cos\theta$,
which yields

$$\int_0^{\pi} d\theta\,\sin\theta\cos\theta\,e^{ikr\cos\theta} = \int_{-1}^{1} ds\,s\,e^{ikrs} = \frac{e^{ikr} + e^{-ikr}}{ikr} + \frac{e^{ikr} - e^{-ikr}}{k^2r^2} = \frac{1}{ikr}\left(1 - \frac{1}{ikr}\right)e^{ikr} + \frac{1}{ikr}\left(1 + \frac{1}{ikr}\right)e^{-ikr}.$$
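The polar-angle integral is easy to check numerically. The sketch below is an illustration added here, not the author's calculation; the helper names `simpson`, `angular_numeric`, and `angular_closed_form` are assumptions. It compares a Simpson-rule evaluation of $\int_{-1}^{1} ds\,s\,e^{ixs}$, with $x = kr$, against the closed form $(e^{ix} + e^{-ix})/(ix) + (e^{ix} - e^{-ix})/x^2$.

```python
import cmath

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def angular_numeric(x):
    """Integral of s*exp(i*x*s) over s in [-1, 1], where x stands for k*r."""
    return simpson(lambda s: s * cmath.exp(1j * x * s), -1.0, 1.0)

def angular_closed_form(x):
    """(e^{ix} + e^{-ix})/(ix) + (e^{ix} - e^{-ix})/x^2."""
    ep, em = cmath.exp(1j * x), cmath.exp(-1j * x)
    return (ep + em) / (1j * x) + (ep - em) / (x * x)

if __name__ == "__main__":
    for x in (0.3, 2.5, 7.0):  # arbitrary sample values of k*r
        print(x, angular_numeric(x), angular_closed_form(x))
```

Both expressions are purely imaginary and equal to $2i\left[\sin x/x^2 - \cos x/x\right]$, which is a convenient form for the radial integration that follows.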

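The remaining integral over $r$ is the sum of two integrals:

$$v_{m_0,s}(\theta_E) = \frac{e\mathcal{E}_0\pi\cos\theta_E}{ik\sqrt{\pi a_B^3L^3}}\left[\int_0^{\infty} dr\,r^2\left(1 - \frac{1}{ikr}\right)e^{-r(1/a_B - ik)} + \int_0^{\infty} dr\,r^2\left(1 + \frac{1}{ikr}\right)e^{-r(1/a_B + ik)}\right]$$

$$= \frac{16i\,e\mathcal{E}_0\pi\cos\theta_E}{\sqrt{\pi a_B^3L^3}}\,\frac{a_B^5k}{\left(1 + a_B^2k^2\right)^3} = 16i\sqrt{\frac{\pi a_B^3}{L^3}}\,e\mathcal{E}_0a_B\cos\theta_E\,\frac{a_Bk}{\left(1 + a_B^2k^2\right)^3},$$

which I, to be honest, computed using Mathematica©. Now, this matrix element
must go to Eq. 15.39, where it yields (taking into account the summation over spin)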


$$R_{m_0}(\theta_E) = 2\,\frac{L^3\sqrt{2}\,m_e^{3/2}}{4\pi^2\hbar^4}\,\sqrt{E_{tr}}\;\frac{256\,k_{tr}^2}{\left(1 + a_B^2k_{tr}^2\right)^6}\,\frac{\pi a_B^7}{L^3}\,e^2\mathcal{E}_0^2\cos^2\theta_E = \frac{64\left(e\mathcal{E}_0a_B\right)^2}{\pi\hbar}\,\frac{2m_ea_B^2}{\hbar^2}\,\cos^2\theta_E\,\frac{\left(k_{tr}a_B\right)^3}{\left(1 + k_{tr}^2a_B^2\right)^6}, \tag{15.44}$$

where $k_{tr} = \sqrt{2m_eE_{tr}}/\hbar$. Substituting the expression for the Bohr radius from
Eq. 8.8 into $2m_ea_B^2/\hbar^2$, you will immediately see that this factor is just $1/|E_g|$,
where $E_g$ is the ground state energy of hydrogen, which in this example plays the
role of $E_{m_0}$, the energy of the initial state. Thus, the expression for $R_{m_0}(\theta_E)$ can
now be rewritten as

$$R_{m_0}(\theta_E) = \frac{64\left(e\mathcal{E}_0a_B\right)^2}{\pi\hbar\,|E_g|}\,\cos^2\theta_E\,\frac{\left(k_{tr}a_B\right)^3}{\left(1 + k_{tr}^2a_B^2\right)^6}.$$
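The radial integral behind the matrix element does not require a computer algebra system to check. After the polar-angle integration, the integrand is $r^3e^{-r/a_B}\cdot 2i\left[\sin(kr)/(kr)^2 - \cos(kr)/(kr)\right]$, and the quoted result corresponds to $\int_0^\infty dr\,(\ldots) = 16i\,k\,a_B^5/(1 + a_B^2k^2)^3$. The Python sketch below is an added illustration; the function names and the sample values of $k$ and $a_B$ are assumptions. It verifies the real factor $8ka_B^5/(1 + a_B^2k^2)^3$, that is, the result with the common factor $2i$ removed.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def radial_numeric(k, a_B, cutoff=60.0, n=12000):
    """Integrate exp(-r/a_B) * [r sin(kr)/k^2 - r^2 cos(kr)/k] from 0 to cutoff*a_B.

    The factor r^3 is multiplied in analytically, so nothing is divided by r at r = 0.
    """
    def integrand(r):
        return math.exp(-r / a_B) * (r * math.sin(k * r) / k ** 2
                                     - r * r * math.cos(k * r) / k)
    return simpson(integrand, 0.0, cutoff * a_B, n)

def radial_closed_form(k, a_B):
    """8 k a_B^5 / (1 + a_B^2 k^2)^3, the quoted result divided by 2i."""
    return 8.0 * k * a_B ** 5 / (1.0 + (a_B * k) ** 2) ** 3

if __name__ == "__main__":
    for k, a_B in ((0.3, 1.0), (1.0, 1.0), (2.0, 0.5)):  # arbitrary units
        print(k, a_B, radial_numeric(k, a_B), radial_closed_form(k, a_B))
```

Multiplying this factor by $2i$, by the $2\pi$ from the azimuthal integration, and by the prefactor $e\mathcal{E}_0\cos\theta_E/(2\sqrt{\pi a_B^3L^3})$ reproduces the matrix element used in Eq. 15.44.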


The angular dependence of this transition rate is given by the factor $\cos^2\theta_E$, which
indicates that the majority of the electrons propagate after the ionization along the
direction of the field (not a big surprise here, of course), while the ionization
probability in the direction perpendicular to the field is zero. To get the complete
ionization rate regardless of the direction, one needs to integrate this result with
respect to the angles $\theta_E$ and $\varphi_E$, remembering, of course, that the integration occurs
over a solid angle $\sin\theta_E\,d\theta_E\,d\varphi_E$. This involves computing the following expression:


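$$\int_0^{2\pi} d\varphi_E\int_0^{\pi} d\theta_E\,\sin\theta_E\cos^2\theta_E = 2\pi\int_{-1}^{1} ds\,s^2 = \frac{4\pi}{3},$$

so that the total ionization rate becomes

$$R_{m_0} = \frac{256\left(e\mathcal{E}_0a_B\right)^2}{3\hbar\,|E_g|}\,\frac{\left(k_{tr}a_B\right)^3}{\left(1 + k_{tr}^2a_B^2\right)^6}. \tag{15.45}$$

The dimensionless parameter $k_{tr}a_B$ can be written down in the form

$$k_{tr}a_B = \frac{\sqrt{2m_e\left(\hbar\Omega + E_g\right)}\,a_B}{\hbar} = \sqrt{\frac{\hbar\Omega + E_g}{|E_g|}} = \sqrt{\frac{\hbar\Omega}{|E_g|} - 1}, \tag{15.46}$$

which reveals that the ionization is only possible if $\hbar\Omega > |E_g|$, where this parameter
remains real. The dependence of the ionization rate upon frequency is determined
by the factor

$$\beta = \frac{\left(k_{tr}a_B\right)^3}{\left(1 + k_{tr}^2a_B^2\right)^6}.$$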

Plotting it as a function of the dimensionless parameter $k_{tr}a_B$, you will notice that this
factor exhibits two different modes of behavior: growth as $(k_{tr}a_B)^3$ for $k_{tr}a_B \ll 1$
and a much faster decrease as $(k_{tr}a_B)^{-9}$ for $k_{tr}a_B \gg 1$ (see Fig. 15.4). (Actually, you
could have noticed this without any plots, by simply analyzing this function in two
different limits, but it is always nice to have something to look at.) The transition
between these two regimes occurs at $k_{tr}a_B = 1/\sqrt{3}$, where the function $\beta(k_{tr})$ has a
maximum. This corresponds to the external frequency $\Omega_{max}$:

$$\hbar\Omega_{max} = \frac{4|E_g|}{3}.$$

This condition means that the maximum efficiency of ionization occurs at the
frequency of oscillations that excites electrons to an energy equal in magnitude
to one third of the ground state energy, $E_{tr} = |E_g|/3$.

Fig. 15.4 Dependence of the ionization rate on the dimensionless parameter $ka_B$
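The position of the maximum can be confirmed without a plot. The following sketch is an added illustration; the golden-section search and the names `beta`, `golden_max`, and `photon_energy_ratio` are my own choices, not something taken from the text. It locates the maximum of $\beta(x) = x^3/(1 + x^2)^6$, with $x = k_{tr}a_B$, and converts it to $\hbar\Omega/|E_g| = 1 + x^2$ using Eq. 15.46.

```python
import math

def beta(x):
    """Frequency-dependent factor of the ionization rate, x = k_tr * a_B."""
    return x ** 3 / (1.0 + x * x) ** 6

def golden_max(f, lo, hi, tol=1e-9):
    """Golden-section search for the maximum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) > f(d):
            b = d          # the maximum lies in [a, d]
        else:
            a = c          # the maximum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)

def photon_energy_ratio(x):
    """hbar*Omega / |E_g| that corresponds to a given x = k_tr * a_B (Eq. 15.46)."""
    return 1.0 + x * x

if __name__ == "__main__":
    x_max = golden_max(beta, 0.01, 5.0)
    print(x_max, 1.0 / math.sqrt(3.0), photon_energy_ratio(x_max))
```

The same value follows analytically from $d\ln\beta/dx = 3/x - 12x/(1 + x^2) = 0$, which gives $x^2 = 1/3$.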


15.3 Semiclassical Theory of Absorption and Emission of Light

15.3.1 Interaction of Charged Particles with Electromagnetic Radiation

In introductory physics courses, you learned that a particle with charge $q$ moving
with instantaneous velocity $\mathbf{v}$ in electric and magnetic fields $\boldsymbol{\mathcal{E}}$ and $\mathbf{B}$ experiences a
force described by the familiar Lorentz expression (in SI units):

$$\mathbf{F} = q\boldsymbol{\mathcal{E}} + q\,\mathbf{v}\times\mathbf{B}.$$

Unfortunately, this simple equation is ill-suited for application in quantum mechanics,
which is based on the Hamiltonian representation of all interactions. Fortunately,
it is possible to describe the interaction between a charged particle and an electromagnetic
field while remaining firmly grounded in the Hamiltonian approach. However, to get
there I have to remind you of a few facts from electromagnetism first.

You know that in electrostatics and magnetostatics, when an electric field can
only be created by charges, and a magnetic field arises only in the presence of an
electric current, both fields can be expressed in terms of their respective potential
functions. The electric field is related to a scalar potential $V(\mathbf{r})$ as $\boldsymbol{\mathcal{E}} = -\nabla V(\mathbf{r})$,
while the magnetic field can be found from a vector potential $\mathbf{A}(\mathbf{r})$ as $\mathbf{B} = \nabla\times\mathbf{A}$,
where $\nabla$ is the well-known vector del (or nabla) operator. Neither of these potentials is
defined uniquely: you can add any constant term to the scalar potential and any vector
function expressible as a gradient to the vector potential without changing the fields
(the last statement follows from the operator identity $\nabla\times\nabla f = 0$).

When electric and magnetic fields are allowed to change with time, you are 
moving into the realm of electrodynamics, where everything gets intertwined and 
the electric field can be generated by a time-dependent magnetic field and vice 
versa. One of the immediate consequences of this is a possibility for electric and 
magnetic fields to sustain each other without the need for any charges or currents. 
Do not get me wrong here, though: some charges or currents are needed someplace 
to generate the fields, but the point is that these fields can break out free of their 
initial sources and travel through space without any charges or currents in the form 
of electromagnetic waves. Of course, I am sure you know all this, but it is still 
useful to recall these simple facts to create a proper context for what follows. 
While the transition to electrodynamics does not change the relation between the 
magnetic field and the vector potential, the electric field now acquires an additional 
contribution in the form 


$$\boldsymbol{\mathcal{E}} = -\nabla V(\mathbf{r}) - \frac{\partial\mathbf{A}}{\partial t}.$$

The appearance of the vector potential in the expression for the electric field reflects 
the possibility for the generation of this field by the time-dependent magnetic field 
(this explains the time derivative). In the region free of charges, the scalar potential 
term can be eliminated using the non-uniqueness of the potentials, so that both 
electric and magnetic fields are expressed in terms of a single vector potential: 

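$$\boldsymbol{\mathcal{E}} = -\frac{\partial\mathbf{A}}{\partial t}; \qquad \mathbf{B} = \nabla\times\mathbf{A}, \tag{15.47}$$

which can also be made to satisfy the condition $\nabla\cdot\mathbf{A} = 0$. This condition is
called a gauge-fixing condition, and the potential defined this way is said to be in
the Coulomb or transverse gauge.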

Now I can approach the main issue of concern: the Hamiltonian formulation
of the particle-field interaction in classical physics. It turns out that the classical
Lorentz force on a particle interacting with the electromagnetic field in the absence
of charges and currents can be reproduced from the Hamiltonian equations if the
Hamiltonian of the charged particle is written down as

$$H = \frac{\left[\mathbf{p} - q\mathbf{A}(\mathbf{r},t)\right]^2}{2m_e} = \frac{\sum_j\left(p_j - qA_j\right)\left(p_j - qA_j\right)}{2m_e}, \tag{15.48}$$

where in the last expression I used the definition of the magnitude of a vector in
terms of its Cartesian components. To verify this fact I just need to write down the
classical Hamiltonian equations as given by Eqs. 7.8 and 7.7:

$$\frac{dr_i}{dt} = \frac{\partial H}{\partial p_i} = \frac{p_i - qA_i}{m_e}, \tag{15.49}$$

$$\frac{dp_i}{dt} = -\frac{\partial H}{\partial r_i} = \frac{q}{m_e}\sum_j\left(p_j - qA_j\right)\frac{\partial A_j}{\partial r_i}. \tag{15.50}$$

Differentiating the first of these equations with respect to time and using the time
derivative of the momentum from the second equation, I obtain

$$\frac{d^2r_i}{dt^2} = \frac{1}{m_e}\frac{dp_i}{dt} - \frac{q}{m_e}\frac{dA_i}{dt} = -\frac{q}{m_e}\frac{dA_i}{dt} + \frac{q}{m_e^2}\sum_j\left(p_j - qA_j\right)\frac{\partial A_j}{\partial r_i}.$$

Multiplying both sides of this equation by $m_e$ and noting that the velocity of the
particle is $v_i = dr_i/dt = (p_i - qA_i)/m_e$, I can rewrite this as

$$m_e\frac{d^2r_i}{dt^2} = -q\frac{dA_i}{dt} + q\sum_j v_j\frac{\partial A_j}{\partial r_i}. \tag{15.51}$$

Now you need to pay attention. The time derivative of the vector potential appearing
in Eq. 15.51 is a full derivative (straight $d$) as opposed to the partial derivative (round
$\partial$) appearing in Eq. 15.47. The difference is significant because the vector potential
is a function of the coordinate of the particle, and it changes with time for two
reasons: one is its explicit time dependence, and the other is due to the motion of
the particle. In other words, when computing this time derivative, you must treat the
position vector $\mathbf{r}$ as a function of time. Accordingly, you have to write

$$\frac{dA_i}{dt} = \frac{\partial A_i}{\partial t} + \sum_j\frac{\partial A_i}{\partial r_j}\frac{dr_j}{dt} = \frac{\partial A_i}{\partial t} + \sum_j v_j\frac{\partial A_i}{\partial r_j},$$

where I introduced the particle’s velocity $v_j = dr_j/dt$. Taking this into account,
Eq. 15.51 takes the following form:



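$$m_e\frac{d^2r_i}{dt^2} = -q\frac{\partial A_i}{\partial t} + q\sum_j v_j\left(\frac{\partial A_j}{\partial r_i} - \frac{\partial A_i}{\partial r_j}\right). \tag{15.52}$$

Comparing this with Eq. 15.47, you can immediately identify the first term on the
right as $q\mathcal{E}_i$, which is the electric field component of the Lorentz force. In order to
prove that the second term reproduces the contribution of the magnetic field to the
force, note that using the identity $\mathbf{A}\times(\mathbf{D}\times\mathbf{C}) = \mathbf{D}(\mathbf{A}\cdot\mathbf{C}) - (\mathbf{A}\cdot\mathbf{D})\mathbf{C}$, I can transform
$\mathbf{v}\times\mathbf{B}$ as

$$\mathbf{v}\times\mathbf{B} = \mathbf{v}\times(\nabla\times\mathbf{A}) = \nabla(\mathbf{v}\cdot\mathbf{A}) - (\mathbf{v}\cdot\nabla)\mathbf{A},$$

where the gradient acts only on the vector potential. Rewriting this expression in the component form

$$(\mathbf{v}\times\mathbf{B})_i = \sum_j v_j\frac{\partial A_j}{\partial r_i} - \sum_j v_j\frac{\partial A_i}{\partial r_j},$$

you see that it exactly coincides with the second term in Eq. 15.52. Now you can
take a breath: we have shown that the Hamiltonian given by Eq. 15.48 indeed
correctly reproduces the equation of motion for a charged particle in the field of
an electromagnetic wave. If, in addition to the electromagnetic field, the particle
also has a regular potential energy of interaction, for instance, with another
charged particle, the total quantum Hamiltonian capable of handling this situation
becomes

$$\hat{H} = \frac{\left[\hat{\mathbf{p}} - q\mathbf{A}(\hat{\mathbf{r}},t)\right]^2}{2m_e} + V(\hat{\mathbf{r}}), \tag{15.53}$$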


where the classical coordinate and momentum are replaced by the corresponding
operators, while the vector potential remains a regular classical quantity. In many
problems involving the interaction between quantum systems and radiation, the term
proportional to the square of the vector potential can be neglected, in which case the
Hamiltonian can be presented as

$$\hat{H} = \frac{\hat{\mathbf{p}}^2}{2m_e} + \frac{e}{2m_e}\left(\hat{\mathbf{p}}\cdot\mathbf{A}(\hat{\mathbf{r}},t) + \mathbf{A}(\hat{\mathbf{r}},t)\cdot\hat{\mathbf{p}}\right) + V(\hat{\mathbf{r}}), \tag{15.54}$$

where I replaced the generic charge $q$ with the negative electron charge $-e$ and,
most importantly, refrained from equating the expressions $\hat{\mathbf{p}}\cdot\mathbf{A}(\hat{\mathbf{r}},t)$ and $\mathbf{A}(\hat{\mathbf{r}},t)\cdot\hat{\mathbf{p}}$, which,
while natural in the classical description, would in general be wrong in the quantum case
due to the non-commutativity of the momentum and position operators. However, using the
gauge-fixing condition for the vector potential, the Hamiltonian 15.54 can still be
simplified. Indeed, consider the expression $\hat{\mathbf{p}}\cdot\mathbf{A}(\mathbf{r},t)f(\mathbf{r})$, where $f(\mathbf{r})$ is an arbitrary
function, using the position representation for the momentum and coordinate
operators:

$$\hat{\mathbf{p}}\cdot\mathbf{A}(\mathbf{r},t)f(\mathbf{r}) = -i\hbar\nabla\cdot\left[\mathbf{A}(\mathbf{r},t)f(\mathbf{r})\right] = -i\hbar\mathbf{A}(\mathbf{r},t)\cdot\nabla f(\mathbf{r}) - i\hbar f(\mathbf{r})\,\nabla\cdot\mathbf{A}(\mathbf{r},t).$$

In the Coulomb gauge, the last term in the above expression disappears, allowing
me to replace $\hat{\mathbf{p}}\cdot\mathbf{A}(\hat{\mathbf{r}},t)$ with $\mathbf{A}(\hat{\mathbf{r}},t)\cdot\hat{\mathbf{p}}$ and present the Hamiltonian as

$$\hat{H} = \frac{\hat{\mathbf{p}}^2}{2m_e} + \frac{e}{m_e}\mathbf{A}(\hat{\mathbf{r}},t)\cdot\hat{\mathbf{p}} + V(\hat{\mathbf{r}}). \tag{15.55}$$
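Going back for a moment to the classical Hamiltonian of Eq. 15.48, the claim that it reproduces the Lorentz force can also be checked numerically. The sketch below is an added illustration under stated assumptions: a static uniform magnetic field $B$ along $Z$ described by the symmetric-gauge potential $\mathbf{A} = (B/2)(-y, x, 0)$, units with $q = m_e = B = 1$, and a hand-written fourth-order Runge-Kutta integrator. None of these choices come from the text. It integrates Hamilton's equations 15.49 and 15.50 and compares the resulting velocity with the cyclotron motion that follows from $m_e\,d\mathbf{v}/dt = q\,\mathbf{v}\times\mathbf{B}$.

```python
import math

def vector_potential(x, y, B):
    """Symmetric-gauge potential A = (B/2)(-y, x) of a uniform field B along z."""
    return (-0.5 * B * y, 0.5 * B * x)

def hamilton_rhs(state, q, m, B):
    """Right-hand sides of Eqs. 15.49 and 15.50 for the state (x, y, px, py)."""
    x, y, px, py = state
    ax, ay = vector_potential(x, y, B)
    vx = (px - q * ax) / m              # dr_i/dt = (p_i - q A_i)/m
    vy = (py - q * ay) / m
    dAy_dx = 0.5 * B                    # only non-zero derivatives of A
    dAx_dy = -0.5 * B
    dpx = q * vy * dAy_dx               # dp_i/dt = q * sum_j v_j dA_j/dr_i
    dpy = q * vx * dAx_dy
    return (vx, vy, dpx, dpy)

def rk4_step(state, dt, q, m, B):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = hamilton_rhs(state, q, m, B)
    k2 = hamilton_rhs(shifted(state, k1, dt / 2), q, m, B)
    k3 = hamilton_rhs(shifted(state, k2, dt / 2), q, m, B)
    k4 = hamilton_rhs(shifted(state, k3, dt), q, m, B)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def hamiltonian_velocity(t_end, q=1.0, m=1.0, B=1.0, v0=1.0, dt=1e-3):
    """Velocity at time t_end obtained from Hamilton's equations."""
    state = (0.0, 0.0, m * v0, 0.0)     # A = 0 at the origin, so p = m v there
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, q, m, B)
    x, y, px, py = state
    ax, ay = vector_potential(x, y, B)
    return ((px - q * ax) / m, (py - q * ay) / m)

def lorentz_velocity(t, q=1.0, m=1.0, B=1.0, v0=1.0):
    """Solution of m dv/dt = q v x B for v(0) = (v0, 0) and B along z."""
    w = q * B / m                       # cyclotron frequency
    return (v0 * math.cos(w * t), -v0 * math.sin(w * t))

if __name__ == "__main__":
    print(hamiltonian_velocity(5.0), lorentz_velocity(5.0))
```

Note that the canonical momentum `px, py` by itself does not follow the Lorentz law; only the combination $(\mathbf{p} - q\mathbf{A})/m_e$, the actual velocity, does.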

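Assuming that the vector potential $\mathbf{A}(\mathbf{r},t)$ describes a plane linearly polarized
electromagnetic wave, I can choose it to have the following form:

$$\mathbf{A}(\mathbf{r},t) = -\mathbf{w}\,\frac{\mathcal{E}_0}{\Omega}\cos\left(\mathbf{k}\cdot\mathbf{r} - \Omega t\right), \tag{15.56}$$

where $\mathbf{w}$ is a unit vector defining the polarization of the wave. To satisfy the
condition $\nabla\cdot\mathbf{A} = 0$, $\mathbf{w}$ must be perpendicular to the propagation direction of the
wave $\mathbf{k}$. According to Eq. 15.47, the chosen form of the vector potential describes a
wave with the electric field:

$$\boldsymbol{\mathcal{E}} = \mathcal{E}_0\mathbf{w}\sin\left(\mathbf{k}\cdot\mathbf{r} - \Omega t\right). \tag{15.57}$$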
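The step from the vector potential to the electric field is a single time derivative, and it can be checked with a finite difference. In the sketch below (an added illustration; the function names and the sample values of the wave parameters are assumptions), the field $-\partial\mathbf{A}/\partial t$ is computed numerically from Eq. 15.56 and compared with Eq. 15.57.

```python
import math

def vector_potential(r, t, w, k, E0, Omega):
    """A(r, t) = -w (E0/Omega) cos(k.r - Omega t), Eq. 15.56."""
    phase = sum(ki * ri for ki, ri in zip(k, r)) - Omega * t
    return tuple(-wi * (E0 / Omega) * math.cos(phase) for wi in w)

def electric_field_numeric(r, t, w, k, E0, Omega, h=1e-6):
    """E = -dA/dt evaluated with a central finite difference in time."""
    a_plus = vector_potential(r, t + h, w, k, E0, Omega)
    a_minus = vector_potential(r, t - h, w, k, E0, Omega)
    return tuple(-(p - m) / (2.0 * h) for p, m in zip(a_plus, a_minus))

def electric_field_closed_form(r, t, w, k, E0, Omega):
    """E = E0 w sin(k.r - Omega t), Eq. 15.57."""
    phase = sum(ki * ri for ki, ri in zip(k, r)) - Omega * t
    return tuple(E0 * wi * math.sin(phase) for wi in w)

if __name__ == "__main__":
    # Sample values (assumed): polarization along x, propagation along z
    w, k = (1.0, 0.0, 0.0), (0.0, 0.0, 2.0)
    r, t, E0, Omega = (0.1, 0.2, 0.3), 0.7, 1.5, 3.0
    print(electric_field_numeric(r, t, w, k, E0, Omega))
    print(electric_field_closed_form(r, t, w, k, E0, Omega))
```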


With the vector potential in the form of Eq. 15.56, the perturbation operator becomes

$$\hat{V} = -\frac{e\mathcal{E}_0}{2m_e\Omega}\left[e^{i(\mathbf{k}\cdot\mathbf{r} - \Omega t)} + e^{-i(\mathbf{k}\cdot\mathbf{r} - \Omega t)}\right]\left(\mathbf{w}\cdot\hat{\mathbf{p}}\right). \tag{15.58}$$

My goal in this section is to apply the perturbation theory for time-dependent
Hamiltonians to optical transitions between discrete states of a hydrogen-like atom
due to incoherent radiation. This job can be made significantly easier if I take
into account that the characteristic scale of the wave functions, which I will be
using to compute the perturbation matrix elements, is determined by the Bohr radius
$a_B$, which for a hydrogen atom in vacuum is about $5.3\times10^{-2}$ nm. According to
Eq. 8.8, it might become two or three orders of magnitude larger for excitons in
semiconductors, but even in this case, it is a far cry from the wavelength $\lambda$ of visible
radiation, which is of the order of 400-700 nm. This means that over the distances
at which the wave functions in the perturbation matrix elements have any substantial
magnitude, the field of the electromagnetic wave barely changes. Consequently I can
safely replace $\exp(\pm i\mathbf{k}\cdot\mathbf{r})$ with unity, which is just the first term in the expansion of
this exponential with respect to the small parameter $\mathbf{k}\cdot\mathbf{r} \sim 2\pi a_B/\lambda \ll 1$, making the
perturbation operator much simpler:

$$\hat{V} = -\frac{e\mathcal{E}_0}{2m_e\Omega}\left[e^{i\Omega t} + e^{-i\Omega t}\right]\left(\mathbf{w}\cdot\hat{\mathbf{p}}\right). \tag{15.59}$$
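To get a feeling for how small the expansion parameter $2\pi a_B/\lambda$ really is, one can simply evaluate it. The snippet below is an added illustration; the function name is an assumption, and the hundredfold larger exciton radius is just a representative value of the "two or three orders of magnitude" mentioned above.

```python
import math

A_BOHR_NM = 0.053  # Bohr radius of hydrogen in vacuum quoted in the text, in nm

def dipole_parameter(wavelength_nm, radius_nm=A_BOHR_NM):
    """Small parameter 2*pi*a_B/lambda of the dipole approximation."""
    return 2.0 * math.pi * radius_nm / wavelength_nm

if __name__ == "__main__":
    for wavelength in (400.0, 550.0, 700.0):     # visible range, nm
        print(wavelength, dipole_parameter(wavelength))
    # Hypothetical exciton with a radius 100 times the Bohr radius
    print(dipole_parameter(550.0, radius_nm=100.0 * A_BOHR_NM))
```

For hydrogen the parameter stays below $10^{-3}$ across the visible range, and even for the enlarged radius it remains well below unity, although the first correction is then no longer negligible at the percent level.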

This approximation is called the dipole approximation for reasons which will become
clear later. Comparing Eq. 15.59 with Eq. 15.16, I can identify the matrix element
$v_{m_0m}$ appearing in all formulas for the transition probability as

$$v_{m_0m} = -\frac{e\mathcal{E}_0}{2m_e\Omega}\left(\mathbf{w}\cdot\hat{\mathbf{p}}\right)_{m_0m}, \tag{15.60}$$

where

$$\left(\mathbf{w}\cdot\hat{\mathbf{p}}\right)_{m_0m} = \langle\alpha_{m_0}|\mathbf{w}\cdot\hat{\mathbf{p}}|\alpha_m\rangle \tag{15.61}$$

is the matrix element of the component of the momentum operator in the direction
of polarization of the incident light.

The computation of this matrix element in the position representation would 
involve differentiating the wave functions of the final state, which is not a terribly 
difficult thing to do but is better avoided if possible. And it can be, indeed, 
avoided with the help of one of the magic tricks from the arsenal of quantum 
mechanics. Recall the commutator of the position operator with the unperturbed
Hamiltonian, which I computed in the section on the Ehrenfest theorem, Eq. 4.18. A trivial
generalization of that result to all three components of the position vector allows me to
present it as

$$\left[\hat{\mathbf{r}}, \hat{H}_0\right] = i\hbar\hat{\mathbf{p}}/m_e.$$

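Using this result to replace the momentum operator in Eq. 15.61, I can turn it into

$$\left(\mathbf{w}\cdot\hat{\mathbf{p}}\right)_{m_0m} = \frac{m_e}{i\hbar}\langle\alpha_{m_0}|\mathbf{w}\cdot\left(\hat{\mathbf{r}}\hat{H}_0 - \hat{H}_0\hat{\mathbf{r}}\right)|\alpha_m\rangle = \frac{m_e}{i\hbar}\langle\alpha_{m_0}|\mathbf{w}\cdot\hat{\mathbf{r}}\hat{H}_0|\alpha_m\rangle - \frac{m_e}{i\hbar}\langle\alpha_{m_0}|\hat{H}_0\mathbf{w}\cdot\hat{\mathbf{r}}|\alpha_m\rangle.$$

Acting with the Hamiltonian to the right in the first term, and to the left in the second
one, I can present this expression as

$$\left(\mathbf{w}\cdot\hat{\mathbf{p}}\right)_{m_0m} = \frac{m_e}{i\hbar}\left(E_m - E_{m_0}\right)\langle\alpha_{m_0}|\mathbf{w}\cdot\hat{\mathbf{r}}|\alpha_m\rangle,$$

reducing the matrix element of the momentum operator to that of the position
operator. Substituting this result into Eq. 15.60, I find

$$v_{m_0m} = \frac{ie\mathcal{E}_0\left(E_m - E_{m_0}\right)}{2\hbar\Omega}\langle\alpha_{m_0}|\mathbf{w}\cdot\hat{\mathbf{r}}|\alpha_m\rangle. \tag{15.62}$$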
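The relation between the momentum and position matrix elements, $p_{ab} = (m_e/i\hbar)(E_b - E_a)\,x_{ab}$, holds for any Hamiltonian of the form $\hat{p}^2/2m_e + V(\hat{\mathbf{r}})$, so it can be tested on a system whose matrix elements are known in closed form. The sketch below uses the one-dimensional harmonic oscillator with $\hbar = m = \omega = 1$ as such a test case. This choice, like the function names, is my assumption for illustration; the text itself deals with hydrogen-like atoms.

```python
import math

def oscillator_matrices(n_states):
    """Matrices of x and p and the energies in the oscillator basis (hbar = m = omega = 1)."""
    x = [[0.0] * n_states for _ in range(n_states)]
    p = [[0.0j] * n_states for _ in range(n_states)]
    for n in range(1, n_states):
        amp = math.sqrt(n / 2.0)
        # <n-1| x |n> = <n| x |n-1> = sqrt(n/2)
        x[n - 1][n] = x[n][n - 1] = amp
        # <n-1| p |n> = -i sqrt(n/2),  <n| p |n-1> = +i sqrt(n/2)
        p[n - 1][n] = -1j * amp
        p[n][n - 1] = 1j * amp
    energies = [n + 0.5 for n in range(n_states)]
    return x, p, energies

def max_identity_violation(n_states=8):
    """Largest |p_ab - (1/i)(E_b - E_a) x_ab| over all pairs of states a, b."""
    x, p, e = oscillator_matrices(n_states)
    worst = 0.0
    for a in range(n_states):
        for b in range(n_states):
            rhs = (e[b] - e[a]) * x[a][b] / 1j
            worst = max(worst, abs(p[a][b] - rhs))
    return worst

if __name__ == "__main__":
    print(max_identity_violation())
```

The printed deviation is at the level of rounding errors, which confirms both the identity and the order of the energies in the difference $E_m - E_{m_0}$.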


It will be useful to recall now that the transition probability, which I intend to
calculate, contains the delta-function $\delta\left(\hbar\Omega - E_m + E_{m_0}\right)$, which ensures that for
any $\Omega$, only those transitions have a non-zero chance to occur for which $E_m - E_{m_0} = \hbar\Omega$.
This observation allows further simplification of Eq. 15.62 by canceling the
energy difference $E_m - E_{m_0}$ in its numerator against $\hbar\Omega$, which contains the frequency
of the perturbation, in the denominator. This yields for the matrix element:

$$v_{m_0m} = \frac{ie\mathcal{E}_0}{2}\langle\alpha_{m_0}|\mathbf{w}\cdot\hat{\mathbf{r}}|\alpha_m\rangle. \tag{15.63}$$


The reduction of the momentum matrix element to this form is, of course, very
useful technically, but it is also quite remarkable from a more philosophical, if you
will, point of view. To fully appreciate the significance of this result, take a
look at the expression for the electric field corresponding to the chosen form of the
vector potential, Eq. 15.57. If you were to neglect the $\mathbf{k}\cdot\mathbf{r}$ term in it, you would end
up with a spatially uniform electric field, which can be associated with a potential
energy:

$$V = -e\boldsymbol{\mathcal{E}}\cdot\mathbf{r} = \frac{ie\mathcal{E}_0\left(\mathbf{w}\cdot\mathbf{r}\right)}{2}\left(e^{-i\Omega t} - e^{i\Omega t}\right) \tag{15.64}$$

and which would generate the time-independent perturbation matrix element $v_{m_0m}$
of exactly the form given in Eq. 15.63. (Note that $v_{m_0m}$ is defined as the coefficient in
front of the $\exp(-i\Omega t)$ term.) Now, think about it: I started out with the perturbation
expressed in terms of a vector potential introduced to describe a solenoidal (with
zero divergence) electric field, which can only be created by a changing magnetic
field, and ended up with the perturbation expressed in terms of a scalar potential
reserved for the potential electrostatic field. It might appear that physics works
in mysterious ways, but the only mystery here is the beautiful interconnectedness
and self-consistency of the various parts of the machinery of electrodynamics and
quantum mechanics derived from the unified structure of the world.

A more practically minded person could say, of course, that since this result
is only valid in the approximation $\mathbf{k}\cdot\mathbf{r} \ll 1$, it merely means that locally (over
distances much smaller than the wavelength of the electromagnetic wave) there is
not much difference between potential electric fields created by electrical charges
and solenoidal fields arising due to Faraday’s law of induction.

Now, if you recall that I am talking here about an electron interacting with an
equal positive electric charge chosen to serve as the origin of the coordinate system,
you can easily see that the expression $e\mathbf{r}$ represents the dipole moment of this system of
charges and that the perturbation potential is just the energy of a dipole in a uniform
electric field. This is the reason why we use the term “dipole approximation”
to characterize the reduction of the perturbation matrix elements to the form of
Eq. 15.63.

Having derived the expression for the matrix element $v_{m_0m}$, I can consider a few
particular examples of interaction between light and matter.


15.3.2 Absorption and Emission of Incoherent Radiation by Atoms

15.3.2.1 Absorption and Stimulated Emission 

When discussing the interaction between light and electrons in atoms or semiconductors,
one has to consider three different processes: absorption, stimulated
emission, and spontaneous emission. Spontaneous emission refers to the process of 
emission of light in the absence of any external electromagnetic field, which could 
“stimulate” the electron to drop from a higher energy level to a lower one. Because 
of the spontaneous emission, the eigenvectors of the electron’s Hamiltonian (with 
exception of those that correspond to the ground state) do not actually represent 
true stationary states—an electron does not stay in these states forever, eventually 
undergoing a transition to the ground state and emitting light. 

This circumstance makes it obvious that the theory of atomic energy levels and 
the stationary states developed so far is not quite complete. What is missing in this 
story is an interaction of electrons with a quantized electromagnetic field responsible 
for the phenomenon of spontaneous emission. The true stationary state must be 
formed by a combination of vectors belonging to a tensor product of electron and 
electromagnetic field vector spaces, such that the total energy corresponding to such 
a state would remain constant: a decrease of electron energy due to transition to 
the lower energy state shall be compensated by an increase of the energy of the 
electromagnetic field due to emission of light. Unfortunately, the complete theory of 
interaction between atoms and quantized electromagnetic field is outside the scope 
of this book as I already have mentioned before. 

Thus, while I will not be able to resist the temptation of demonstrating the beautiful
thermodynamic arguments of Albert Einstein that allowed him to deduce the rate
of spontaneous emission without any knowledge of quantum electrodynamics, my
main focus in this chapter will be on absorption and stimulated emission in the
presence of classical incoherent radiation.

Within the quantum framework, the process of stimulated emission arises
naturally as a consequence of the hermiticity requirement for operators representing
observables (this is the reason why I had to introduce both the $\exp(i\Omega t)$ and $\exp(-i\Omega t)$
terms in Eq. 15.16). However, before the advent of the full quantum theory, the idea
of stimulated emission appeared quite bizarre. Indeed, think about it: an electron in
an excited state is illuminated by light, and instead of jumping to an even higher
energy level by absorbing this light, it jumps to a lower level, emitting even more
light! The concept of stimulated emission was introduced by Einstein when he
analyzed the thermodynamic equilibrium between light and atoms: it turned out that
without this process, the equilibrium cannot exist.

On second thought, however, it might appear surprising that the existence of
this process was not foreseen earlier. After all, remaining completely within
the classical picture, one can think of a classical dipole interacting with radiation, 
in which two oppositely charged particles are connected by an elastic string. This 
system can both absorb and emit radiation. Indeed, on one hand, the harmonic 
oscillations excited by the incident radiation take energy away from light resulting in 
its absorption, while on the other hand, an oscillating dipole emits electromagnetic 
waves which carry away its energy damping the oscillations. This radiation is 
stimulated by the incident wave, and while quantitatively it might differ significantly 
from quantum stimulated emission, conceptually it is a similar process, and it might 
seem a bit surprising that it took that long to recognize its existence. 

Well, it does not do us any good second-guessing our forebears, so let’s go back
to our immediate task. The transition rates for absorption and stimulated emission,
which are equal to each other, are described by Eq. 15.35. If you think of the incident
electric field $\boldsymbol{\mathcal{E}}(t)$ as an incoherent “superposition” of monochromatic components
with continuously distributed frequencies, you can identify $\mathcal{E}_0$ with $F(\Omega)$ appearing
in Eq. 15.29 and $\mathcal{E}_0^2$ with $|F(\Omega)|^2$ in Eq. 15.35. When dealing with a finite or
discrete number of incoherent waves, you could find the total energy density of
the field by simply adding the energy densities of the individual waves. In the case of a
continuous spectral distribution, the total finite energy density of the field is divided
between an infinite number of infinitesimally small spectral intervals $d\Omega$. The
contribution of each such interval can be described as $u(\Omega)d\Omega$, where $u(\Omega)$ is
the time-averaged spectral energy density of the corresponding spectral component,
which for a wave propagating in vacuum is given by the following expression well
known from introductory electrodynamics (I am using SI units here):

$$u(\Omega) = \frac{1}{2}\varepsilon_0\mathcal{E}_0^2(\Omega).$$

Taking into account Eq. 15.63, I can now present the expression for the transition
rate as

$$R_{m_0m} = \frac{\pi e^2}{2\hbar^2}\left|\mathbf{w}\cdot\langle\alpha_{m_0}|\mathbf{r}|\alpha_m\rangle\right|^2\mathcal{E}_0^2(\omega_{m,m_0}) = \frac{\pi u(\omega_{m,m_0})}{\varepsilon_0\hbar^2}\left|\mathbf{w}\cdot\langle\alpha_{m_0}|e\mathbf{r}|\alpha_m\rangle\right|^2. \tag{15.65}$$

If the incident light has a definite polarization (incoherent polarized light can
be created using a polarizing element), Eq. 15.65 provides a full description of the
corresponding transition rate. If, however, the light is not only incoherent but also
unpolarized, some additional effort is still required. In unpolarized light, the
polarization vector $\mathbf{w}$ must be considered a random quantity, and completing
the calculation of the transition rate in this situation demands averaging
Eq. 15.65 over the polarization directions.

To perform the averaging, I, first of all, must recall that unpolarized light is a
mixture of two independent polarizations, each appearing in the mix with equal
probability 1/2. To simplify the arguments, I will think of them as two linear
polarizations characterized by mutually perpendicular unit vectors $\mathbf{w}_1$ and $\mathbf{w}_2$ with
random but mutually correlated orientations (they must always remain orthogonal
to each other). Therefore, I need to take the average over these two vectors and
integrate it over their directions.

The direction of both polarization vectors must be perpendicular to the wave
vector $\mathbf{k}$. This requirement ties $\mathbf{w}_1, \mathbf{w}_2$ to
the propagation direction of the wave. Consequently, the integration of the transition
rate over $\mathbf{k}$ automatically averages it over all allowed directions of $\mathbf{w}_1$ and $\mathbf{w}_2$. The
resulting polarization-averaged transition rate can be presented as

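$$R_{m_0 m} = \frac{\pi u(\omega_{m,m_0})}{\varepsilon_0\hbar^2}\,\frac{1}{2}\,\frac{1}{4\pi}\iint d\Omega_k \left(\left|\mathbf{w}_1\cdot\langle\alpha_{m_0}|e\mathbf{r}|\alpha_m\rangle\right|^2 + \left|\mathbf{w}_2\cdot\langle\alpha_{m_0}|e\mathbf{r}|\alpha_m\rangle\right|^2\right), \tag{15.66}$$

where the integration is over the surface of constant $k$, and it is assumed that
all directions of $\mathbf{k}$ are equally likely. The factor $1/4\pi$ normalizes the uniform
distribution of $\mathbf{k}$, so that $(1/4\pi)\iint d\Omega_k = 1$. To carry out the integration, let me
first note that $|\mathbf{w}_{1,2}\cdot\mathbf{g}(m,m_0)|^2$, where $\mathbf{g}(m,m_0)$ stands for the vector dipole matrix
element⁵

$$\mathbf{g}(m,m_0) = \left|\langle\alpha_{m_0}|e\mathbf{r}|\alpha_m\rangle\right|, \tag{15.67}$$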

does not depend on the choice of coordinate axes used to define the components
of the vectors in this expression. Using this freedom to simplify it, I choose the
direction of $\mathbf{k}$ as the Z-axis of a coordinate system. This choice makes defining the
polarization vectors $\mathbf{w}_{1,2}$ almost trivial: recalling that they must be perpendicular to
each other and to $\mathbf{k}$, you can choose them to be directed along the arbitrarily chosen
X- and Y-coordinate axes of this system. Now you immediately have

$$|\mathbf{w}_1\cdot\mathbf{g}(m,m_0)|^2 + |\mathbf{w}_2\cdot\mathbf{g}(m,m_0)|^2 = |g_x(m,m_0)|^2 + |g_y(m,m_0)|^2 = g^2(m,m_0)\sin^2\theta\cos^2\varphi + g^2(m,m_0)\sin^2\theta\sin^2\varphi = g^2(m,m_0)\sin^2\theta, \tag{15.68}$$


where $\theta$ is the angle between $\mathbf{k}$ and the vector $\mathbf{g}(m,m_0)$, while the angle $\varphi$ defines the
direction of the projection of this vector onto the X-Y plane.⁶

5 Since I assumed linear polarization, vectors $\mathbf{w}_{1,2}$ are real, allowing me to take them outside of the
absolute value sign.

Since the result expressed by Eq. 15.68 is presented in a form independent of
the choice of the coordinate system (the angle $\theta$ is an objectively existing geometrical
characteristic independent of the choice of coordinates), I could now move to
different coordinate axes, if it suited me better. Indeed, the next step of the averaging
procedure involves integration over directions of the wave vector, so it does not
make sense to use $\mathbf{k}$ as the polar axis. Allowing the vector $\mathbf{g}(m,m_0)$
to have this honor instead will make the integration just a bliss. The angle $\theta$ in Eq. 15.68
now becomes one of the spherical coordinates of the wave vector and thereby an
integration variable. Substituting Eq. 15.68 into the integral in Eq. 15.66, I get


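$$R_{m_0 m} = \frac{\pi u(\omega_{m,m_0})\,g^2(m,m_0)}{\varepsilon_0\hbar^2}\,\frac{1}{8\pi}\iint d\Omega_k \sin^2\theta = \frac{\pi u(\omega_{m,m_0})\,g^2(m,m_0)}{\varepsilon_0\hbar^2}\,\frac{1}{8\pi}\int_0^{\pi} d\theta\,\sin\theta\int_0^{2\pi} d\varphi\,\sin^2\theta = \frac{\pi\left[g(m,m_0)\right]^2}{3\varepsilon_0\hbar^2}\,u(\omega_{m,m_0}), \tag{15.69}$$

where I used the spherical surface element $d\Omega_k = d\varphi\,d\theta\,\sin\theta$.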
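
If you would rather check the angular factor in Eq. 15.69 than take my word for it, the following minimal Python sketch evaluates $(1/8\pi)\iint d\Omega_k \sin^2\theta$ with a simple midpoint rule and compares it with 1/3. The function name and the grid size are illustrative choices of this sketch, not part of the derivation.

```python
import math


def polarization_average(n_theta: int = 2000) -> float:
    """Midpoint-rule estimate of (1/(8*pi)) * integral of sin^2(theta) over the sphere.

    The polar axis is taken along the dipole vector g, as in the text, so the
    integrand does not depend on the azimuthal angle phi.
    """
    d_theta = math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        # Surface element sin(theta) dtheta dphi; the phi integral contributes 2*pi.
        total += math.sin(theta) ** 3 * d_theta * 2.0 * math.pi
    return total / (8.0 * math.pi)


if __name__ == "__main__":
    print(polarization_average())  # numerically very close to 1/3
```

The factor 1/3 is what turns the prefactor $\pi u/(\varepsilon_0\hbar^2)$ of Eq. 15.66 into the $\pi u/(3\varepsilon_0\hbar^2)$ of Eq. 15.69.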

This result describes the transition rate between a pair of states of a single
atom, which can be related either to emission (if $\omega_{m,m_0} < 0$) or to absorption (if
$\omega_{m,m_0} > 0$) of radiation. It is obvious that since $u(\omega_{m,m_0})$ is an even function of
frequency (it has to be if we believe in time-reversal symmetry⁷), the transition rates
for both emission and absorption are equal to each other. A word of warning: the
statement of equal transition probabilities refers to transitions involving the same
pair of states with initial and final states reversed. If, however, you are looking at
an “atom” in some initial state $|\alpha_{m_0}\rangle$ and are interested to know if the perturbation
is more likely to cause upward or downward transitions, you can easily see that
these probabilities do not have to be equal. Indeed, in this case, you are dealing
with transitions characterized by the same initial and two different final states, so
that they might have different transition frequencies $\omega_{m_1,m_0}$ and $\omega_{m_2,m_0}$ and different
matrix elements.


6 One needs to be a bit careful here because the original dipole matrix element vector $\mathbf{g}(m,m_0)$ can
be complex.

7 Time-reversal symmetry means that if you change $t$ to $-t$, nothing must change. Wave equations
of electromagnetic waves are of the second order with respect to the time derivative, and, therefore,
they are obviously time-reversal symmetric. The situation is more complex with the Schrödinger
equation, which is of the first order in the time derivative, but you will have to wait for a more
advanced course on quantum mechanics to dig into this issue.

In general, when discussing quantum transitions and their probabilities, there is “a
must”: you must clearly understand which phenomenon you are dealing with. This
exhortation is especially important if one or both states connected by the transition
are degenerate, in which case Eq. 15.69 has to be used cautiously and the validity
of the statement about equal transition rates for reversed transitions depends on the
situation you are dealing with.

The first question that must be asked is how much control you have over the
initial state: do you know in which of the degenerate states with a given
energy your system is? If the answer is “yes,” then you can proceed to the next question,
but if the answer is “no,” you have to assume that all of the degenerate states are
possible and, in the absence of any reason to prefer one state over another,
that they occur with equal probabilities. In this case the transition rate given by
Eq. 15.69 must be averaged over the initial states. The next question you must ask
is: what do you know (or want to know) about the final state? For instance, if you are
only concerned with the total absorption or stimulated emission rate in the presence
of unpolarized and incoherent radiation (strictly speaking, this is the only situation
for which Eq. 15.69 is valid), then you have to sum the rate given by this
equation over all degenerate final states. In this case the total transition rate between
two energy levels (note the careful choice of language: now I am talking about
the transition between energy levels rather than between states) can be written
down as



$$R_{E_{m_0},E_m} = \frac{1}{g_{m_0}}\sum_{m_0}\sum_{m} R_{m_0 m}, \tag{15.70}$$

where $g_{m_0}$ is the degree of degeneracy of the initial energy level, reflecting the
procedure of averaging over the initial states. In the case of the reversed transition,
the similar expression takes the form

$$R_{E_m,E_{m_0}} = \frac{1}{g_m}\sum_{m}\sum_{m_0} R_{m m_0}, \tag{15.71}$$

where $g_m$ is the degeneracy of $|\alpha_m\rangle$, which is now the initial state. Since generally
$g_m \neq g_{m_0}$, the two total energy-level-to-energy-level rates $R_{E_{m_0},E_m}$ and $R_{E_m,E_{m_0}}$ are
not equal to each other even if the particular state-to-state rates $R_{m_0 m}$ are.
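
The bookkeeping described above (average over the initial states, sum over the final ones) can be sketched in a few lines of Python. The table of state-to-state rates below is entirely made up for illustration, and the function names are my own; the only physics assumed is that a state-to-state rate is the same in both directions.

```python
from typing import List, Sequence


def level_to_level_rate(state_rates: Sequence[Sequence[float]]) -> float:
    """Average over initial states (rows) and sum over final states (columns)."""
    g_initial = len(state_rates)  # degeneracy of the initial level
    return sum(sum(row) for row in state_rates) / g_initial


def reversed_rates(state_rates: Sequence[Sequence[float]]) -> List[List[float]]:
    """Swap the roles of initial and final states (transpose the table)."""
    return [list(column) for column in zip(*state_rates)]


# Hypothetical state-to-state rates in arbitrary units: 2 initial states, 3 final states.
# Row i, column j is the rate between initial state i and final state j.
RATES = [
    [1.0, 0.5, 0.0],
    [0.0, 0.5, 1.0],
]

if __name__ == "__main__":
    up = level_to_level_rate(RATES)                    # averaged over 2 initial states
    down = level_to_level_rate(reversed_rates(RATES))  # averaged over 3 initial states
    print(up, down, 2 * up, 3 * down)
```

With these invented numbers the two level-to-level rates differ (1.5 versus 1.0), while the products of each rate with the degeneracy of its initial level coincide, which is exactly the asymmetry discussed in the text.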

The situation might change if you are interested in the absorption or emission of
light of a particular polarization (experimentally you can express your interest by
inserting a polarizer between the light source and the atom). In this case you, first,
must forgo the averaging over polarizations, go back to Eq. 15.65, and determine
which of the perturbation matrix elements are nonzero for each of
the degenerate states. If the polarization allows you to initiate a transition to just
one particular state, you can congratulate yourself: you found a way to optically
control a quantum state of your system. If you end up with more than one “allowed”
transition (one for which the perturbation matrix element does not vanish), you
still might have to average over the initial states and sum over the final states,
including, however, only the states connected by allowed transitions.

To prevent your head from spinning too much, I will illustrate these points with
an example. Let $|\alpha_{m_0}\rangle$ be the ground state of the electron in a hydrogen atom, which
is doubly degenerate with respect to the spin magnetic number $m_s$, and assume that
$|\alpha_m\rangle$ is its first excited state, which is eightfold degenerate. The composite indexes
$m_0$ and $m$ in this case consist of the principal number $n$, orbital angular momentum
number $l$, and orbital and spin magnetic numbers $m_l$ and $m_s$. The initial state $|\alpha_{m_0}\rangle$
represents one of two states $|1,0,0,1/2\rangle$ or $|1,0,0,-1/2\rangle$, while $|\alpha_m\rangle$ can be one
of eight states $|2,l,m_l,\pm 1/2\rangle$ with $l = 0$, $m_l = 0$ or $l = 1$, $m_l = -1, 0, 1$.
As usual, I am using here the standard nomenclature for the states of a hydrogen
atom $|n,l,m_l,m_s\rangle$, where the first is the principal quantum number, followed by the
orbital angular momentum number, and then followed by orbital and spin magnetic
numbers. Now, assuming that you want to compute the absorption rate for
unpolarized light with any of $|\alpha_{m_0}\rangle$ as the initial state and any of $|\alpha_m\rangle$ as the final,
you have to consider the following:

$$R_{E_1,E_2} = \frac{1}{2}\left[R_{\uparrow;0,0,\uparrow} + R_{\uparrow;0,0,\downarrow} + R_{\downarrow;0,0,\uparrow} + R_{\downarrow;0,0,\downarrow} + \sum_{m_l=-1}^{1}\left(R_{\uparrow;1,m_l,\uparrow} + R_{\uparrow;1,m_l,\downarrow} + R_{\downarrow;1,m_l,\uparrow} + R_{\downarrow;1,m_l,\downarrow}\right)\right].$$

To make the notation less cumbersome, I am showing here only the spin index for the
initial state and have dropped the principal quantum number for both initial and
final states. The factor 1/2 is the inverse of the degeneracy of the initial level, and the
expression includes the summation over different values of the spin variable of the
initial state. You also see the sum over all possible values of the quantum numbers
characterizing the final states; this takes into account the probabilities of transitions
to all states with the same final energy. If the perturbation potential does not depend
on spin, this expression can be simplified since the matrix elements with different
spin magnetic numbers are all equal to each other. In this case, you have for the total
absorption transition rate


$$R_{E_1,E_2} = 2\left[R_{1,0,0;2,0,0} + \sum_{m_l=-1}^{1} R_{1,0,0;2,1,m_l}\right], \tag{15.72}$$


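where now I suppressed the index for the spin magnetic number (because nothing
depends on it) and restored all other indexes for both initial and final states. For the
opposite emission transition from the $n = 2$ energy level to the ground state, the similar
expression for $R_{E_2,E_1}$ is

$$R_{E_2,E_1} = \frac{1}{8}\left[R_{\uparrow;0,0,\uparrow} + R_{\uparrow;0,0,\downarrow} + R_{\downarrow;0,0,\uparrow} + R_{\downarrow;0,0,\downarrow} + \sum_{m_l=-1}^{1}\left(R_{\uparrow;1,m_l,\uparrow} + R_{\uparrow;1,m_l,\downarrow} + R_{\downarrow;1,m_l,\uparrow} + R_{\downarrow;1,m_l,\downarrow}\right)\right] = \frac{1}{2}\left[R_{1,0,0;2,0,0} + \sum_{m_l=-1}^{1} R_{1,0,0;2,1,m_l}\right]. \tag{15.73}$$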


Further computation of these transition rates requires dealing with the matrix
elements between different hydrogen-like states, and I will put it off for now to keep
the suspense on. As a compensation, I will, instead, find for you the absorption and
stimulated emission rates for a one-dimensional harmonic oscillator with mass $\mu$ and
charge $e$ oscillating at frequency $\omega_0$ along the Z-axis (no degeneracy there!). In this
case the dipole matrix element vector $\mathbf{g}(m,m_0)$ defined in Eq. 15.67 has just a single
component $g_z(m,m_0) = \langle\alpha_{m_0}|ez|\alpha_m\rangle$, where $|\alpha_m\rangle = |m\rangle$ now is an eigenvector of
a harmonic oscillator corresponding to the eigenvalue $E_m = \hbar\omega_0(m + 1/2)$. Using
Eq. 7.48 for the coordinate matrix elements, I can write

$$\langle m_0|ez|m\rangle = \begin{cases} e\sqrt{\dfrac{\hbar m_0}{2\mu\omega_0}}, & m = m_0 - 1 \text{ (emission)} \\[2ex] e\sqrt{\dfrac{\hbar (m_0+1)}{2\mu\omega_0}}, & m = m_0 + 1 \text{ (absorption)}. \end{cases} \tag{15.74}$$

Apparently transitions can occur from any state $|m\rangle$ only to adjacent states $|m \pm 1\rangle$
with the transition frequency in both cases being $|\omega_{m,m_0}| = |\omega_0(m_0 + 1/2) -
\omega_0(m_0 \pm 1 + 1/2)| = \omega_0$. Then the stimulated emission and absorption rates for
downward and upward transitions between, say, states $|m_0\rangle$ and $|m_0 - 1\rangle$ become

$$R_{m_0,m_0-1} = R_{m_0-1,m_0} = \frac{\pi e^2 u(\omega_0)}{3\varepsilon_0\hbar^2}\,\frac{\hbar m_0}{2\mu\omega_0} = \frac{\pi e^2\left(E_{m_0} - \hbar\omega_0/2\right)}{6\varepsilon_0\mu\hbar^2\omega_0^2}\,u(\omega_0), \tag{15.75}$$


where I replaced the initial state number $m_0$ with the value of the corresponding
energy level $E_{m_0}$. As a side note: to find transition rates between states $|m_0\rangle$ and
$|m_0 + 1\rangle$, one would need to replace $m_0$ in Eq. 15.75 with $m_0 + 1$, so for the given initial
state $|m_0\rangle$, the probability of absorption (transition to $|m_0 + 1\rangle$) is larger than the
probability of emission (transition to $|m_0 - 1\rangle$) by $\pi e^2 u(\omega_0)/(6\varepsilon_0\hbar\mu\omega_0)$. This is
a useful illustration of the general statement that equality of emission and absorption
probabilities refers to transitions with reversed direction between the same pair of
states. Probabilities of upward or downward transitions from a given initial state can
be and often are different.
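
To see these numbers at work, here is a small Python sketch of the oscillator rates in the form $R = \pi e^2 m\,u(\omega_0)/(6\varepsilon_0\hbar\mu\omega_0)$, where $m$ is the number of the upper state of the pair. The function names and the sample values of the mass, frequency, and spectral energy density are assumptions made purely for illustration.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J*s
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
E_CHARGE = 1.602176634e-19  # elementary charge, C


def stimulated_rate(m_upper: int, mass: float, omega0: float, u: float) -> float:
    """Stimulated rate for the pair |m_upper> <-> |m_upper - 1>, in 1/s.

    u is the spectral energy density at omega0 (SI units, J*s/m^3).
    """
    if m_upper < 0:
        raise ValueError("state number must be non-negative")
    return math.pi * E_CHARGE**2 * m_upper * u / (6.0 * EPS0 * HBAR * mass * omega0)


def emission_rate(m0: int, mass: float, omega0: float, u: float) -> float:
    """Downward transition |m0> -> |m0 - 1>; vanishes for the ground state."""
    return stimulated_rate(m0, mass, omega0, u)


def absorption_rate(m0: int, mass: float, omega0: float, u: float) -> float:
    """Upward transition |m0> -> |m0 + 1>."""
    return stimulated_rate(m0 + 1, mass, omega0, u)


if __name__ == "__main__":
    mass, omega0, u = 9.1093837015e-31, 1.0e15, 1.0e-15  # illustrative values only
    for m0 in (0, 1, 3, 10):
        print(m0, emission_rate(m0, mass, omega0, u), absorption_rate(m0, mass, omega0, u))
```

For any initial state with $m_0 \geq 1$, the ratio of the absorption rate to the emission rate is $(m_0 + 1)/m_0$, so the asymmetry is most pronounced near the bottom of the ladder and fades away for highly excited states.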

The results obtained so far describe a single quantum system (atom, oscillator,
etc.) in a given initial state. However, in most situations, in which these results might 
be of interest, you have to deal with an ensemble of systems, not necessarily all in 
the same initial state. In the presence of a radiation field, some of these atoms would 
undergo upward transitions absorbing light; others will emit light while lowering 
their energy. What we are interested in most of the time is the net result: will there be 
more upward or downward transitions resulting in an overall absorption or emission 
of light? The answer to this question depends not only on the transition rates but 
also upon the number of atoms in different initial states. Below I am going to discuss 
this situation using a simplified model as per Einstein with radiation described as a 
collection of Einstein’s light quanta—photons. 

If the spectral energy density of the incident radiation $u(\omega)$ is peaked at some
frequency $\omega_{\max}$ and falls off within some frequency range comparable to the distance
between energy levels of an atom, I can approximate atoms by two-level systems
with only two energy levels. Let’s assume that per unit volume there is a stationary
(time-independent) number $N_a$ of atoms in the lower energy state $|a\rangle$ with
respective energy eigenvalue $E_a$ and $N_b$ atoms in the higher energy state $|b\rangle$ with
corresponding eigenvalue $E_b > E_a$ such that $E_b - E_a \approx \hbar\omega_{\max}$. Since each atom
in state $|a\rangle$ absorbs photons at rate $R_{ab}$ (I assume here that each upward transition
corresponds to absorption of one photon), the total number of photons absorbed per
unit time would be $N_a R_{ab}$, and taking into account that each absorbed photon carries
energy $\hbar\omega_{ba} = E_b - E_a$, I conclude that the total absorbed power is $\hbar\omega_{ba} N_a R_{ab}$.
Similar arguments yield for the emitted power $\hbar\omega_{ba} N_b R_{ba}$, so that the total change
in the energy density of the incident radiation $u$ becomes

$$\frac{du}{dt} = \hbar\omega_{ba}\left(N_b - N_a\right)R_{ab} \tag{15.76}$$

(emission increases the radiation energy and absorption decreases it). So, the overall 
result depends on the relative number of atoms in the lower and higher energy states: 
if $N_b < N_a$, the system absorbs light; in the opposite case, the system generates light.
Normally, for a system in thermodynamic equilibrium, more atoms stay in the
lower energy state so that absorption of light is the expected normal behavior of any
collection of atoms, electrons, etc. However, by using some clever tricks (which I
cannot discuss here), it is possible to create a situation in which $N_b > N_a$ and the
system becomes a net emitter of light. This situation is quite unusual and cannot 
exist for systems in thermodynamic equilibrium; systems with this property are said 
to demonstrate “population inversion.” Population inversion is the main condition 
for operation of lasers and was realized first in ammonia molecules independently by 
American physicist Charles H. Townes and two Russian physicists, Nikolay Basov 
and Aleksandr Prokhorov, for which they all shared the 1964 Nobel Prize in Physics. 

However, one question here just begs to be answered. A thoughtful reader could
say: “OK, let’s assume that in the absence of light there is, indeed, some distribution
of atoms in the lower and higher energy states. But, once light starts generating those
upward transitions, this distribution will start changing, and since the transition rates
for upward and downward transitions are the same, eventually we must come to a
situation in which $N_b = N_a$ and no net absorption or emission will take place.” I
will present this argument in a more formal way in the next section, and here I will
only mention that it misses an important point. Stimulated emission is not the only
process by which an atom can drop from the higher energy level to a lower one.
Other possibilities are spontaneous emission, which will be discussed in the next
section, and so-called non-radiative processes, when an atom gives away its energy
to something other than radiation: an elastic wave, the kinetic energy of atoms in a gas, or
whatever else. The presented calculation of the absorption coefficient makes sense only
if these processes of energy “dissipation” occur faster than the optical transitions 
from the lower to higher energy level. Only in this case one can really talk about 
stationary distribution of atoms in various energy levels appearing in Eq. 15.76. An 
important lesson of this discussion is that even though we compute the absorption 
rate considering atomic transitions, the actual absorption (the dissipation of energy) 
occurs only when the atom gives off this energy in a form which can no longer be 
converted back into radiation. If you are wondering how I can claim that Eq. 15.76
describes the absorption of light without considering all these processes in a more 
or less explicit form, the answer is that the assumption that the atoms are in thermal 
equilibrium ensures that all these dissipation processes are accounted for. This is 
also a beautiful illustration of the universality of thermodynamic equilibrium: it 
does not matter how the system comes to equilibrium: once it is there, its properties 
are described by the corresponding distribution functions. 

I will conclude this discussion with a derivation of an expression for the absorption
coefficient of light, which is defined by considering a beam of light
of intensity $I$ propagating in some well-defined direction. If $z$ is the coordinate in
the direction of propagation of the beam, the absorption coefficient $\kappa$ is defined
via the relation

$$I(z) = I_0 e^{-\kappa z}. \tag{15.77}$$

Differentiation of this expression with respect to the coordinate yields

$$\kappa = -\frac{1}{I}\frac{dI}{dz}.$$

Taking into account that light travels the distance $dz$ in time $dt = n_m\,dz/c$, where $n_m$
is the refractive index of the medium in which the light propagates, this can be rewritten as

$$\kappa = -\frac{n_m}{cI}\frac{dI}{dt}.$$

Remembering that the energy density of light is related to its intensity as $I = cu/n_m$,
I can turn this into

$$\kappa = -\frac{n_m}{cu}\frac{du}{dt} = -\frac{n_m}{c}\,\hbar\omega_{ba}\left(N_b - N_a\right)B_{ab}, \tag{15.78}$$

where I presented the transition rate as

$$R_{ab} = B_{ab}\,u(\omega_{ba}) \tag{15.79}$$

with $B_{ab}$ being

$$B_{ab} = \frac{\pi\left[g(m,m_0)\right]^2}{3\varepsilon_0\hbar^2}. \tag{15.80}$$


Using Eq. 15.78 I can formulate for you a verbal definition of the absorption
coefficient not related to the specific representation of intensity as in Eq. 15.77:

$$\text{absorption coefficient} = \frac{\text{energy absorbed/unit time/unit volume}}{\text{total incident intensity}}. \tag{15.81}$$

You probably wonder if it was really necessary to introduce new quantities such
as $B_{ab}$ or whether I am doing this just to annoy you. Well, I could have done without
this coefficient, but it is really, really important, partially for historical reasons:
it was introduced by Einstein and is still called the Einstein B coefficient. You
see, we feel so much reverence for the guy that we do not even dare to change
the notation he used in his original 1916 paper (more than a hundred years ago!).
Actually, there are two such coefficients, $B_{ab}$ and $B_{ba}$, defined as factors in front of
the energy density term $u$ in the corresponding expressions for the transition rates.
In the absence of degeneracy in the initial states $B_{ab} = B_{ba}$; otherwise, comparing
Eqs. 15.70 and 15.71, you can show that $g_a B_{ab} = g_b B_{ba}$. Besides being of historical
and sentimental value, these coefficients also play a more serious role, allowing one to
separate in the transition rate the effects dependent on the properties of the “atoms”
from those dependent on the characteristics of the radiation. The usefulness of this
feature will become especially clear in the next section.


15.3.2.2 Spontaneous Emission Without Quantum Electrodynamics: Einstein Meets Planck

Having gotten through the previous section, you might experience a vague and 
uneasy feeling that you are being duped. You might be thinking: how can one 
assume that the numbers of atoms in different states, $N_{a,b}$, have stationary
(time-independent) values and at the same time argue about emission and absorption
processes, which obviously change them? Well, this is a legitimate question that 
deserves to be answered. The answer as you will see in a minute will not only justify 
arguments leading to Eqs. 15.76 and 15.78, but it will also demonstrate to you one 
of the most beautiful results of early quantum mechanics produced by Einstein as 
early as in 1916. 

First, I will drop the offending assumption that $N_{a,b}$ are constant and will try to
actually describe their time dependence. Here is what I know: transitions $|a\rangle \to |b\rangle$
result in a decrease of $N_a$ and a simultaneous increase of $N_b$. The reverse transitions
$|b\rangle \to |a\rangle$ obviously depopulate state $|b\rangle$ and populate state $|a\rangle$. The rate of all these
changes is proportional to the rate of the respective transition and the number of
atoms in the corresponding states at any given time:

$$\frac{dN_a}{dt} = -N_a B_{ab} u + N_b B_{ba} u \tag{15.82}$$

$$\frac{dN_b}{dt} = -N_b B_{ba} u + N_a B_{ab} u. \tag{15.83}$$

$N_a$ and $N_b$ now are instantaneous numbers of atoms in the respective states per unit
volume, and I replaced the transition rates with their expressions in terms of the
Einstein coefficients $B_{ab}$ and $B_{ba}$ to emphasize the presence of the radiation spectral
energy density $u$. If you add these two equations, you get

$$\frac{d\left(N_a + N_b\right)}{dt} = 0,$$

which simply tells you that the total number of particles in both states remains
constant. This is a somewhat trivial result reflecting the fact that an atom does not
vanish while undergoing a transition from one state to another. The assumption
of time-independent numbers $N_{a,b}$ amounts to a statement that these equations
have stationary solutions in which the processes of population and depopulation
of various states perfectly balance each other. In principle, Eqs. 15.82 and 15.83 do
have a stationary solution in which time derivatives on the left-hand side vanish and
$N_a B_{ab} = N_b B_{ba}$. In the absence of degeneracy, the Einstein coefficients cancel out,
and we find that the stationary solution with constant $N_a$ and $N_b$ is possible only if
these numbers are equal to each other. This cannot be right, of course, and for many
reasons. For instance, if this were true, then Eq. 15.78 would predict zero absorption
in all cases, which is nonsense. Obviously, something is missing in these equations,
and a bit of rumination over them can bring you a happy guess: all processes
included in the equations are stimulated (proportional to the energy density of the
field). But what about spontaneous emission, the one which is responsible for
bringing atoms to their ground state spontaneously without any prompting by the
already present field? This process would increase the number of atoms $N_a$ in the
lower energy state at the expense of $N_b$, which would obviously decrease. Since it is
the atoms in the higher energy state that undergo this spontaneous transition, its
rate must be proportional to $N_b$:


$$\frac{dN_a}{dt} = -\frac{dN_b}{dt} = A_{ab}N_b,$$

where, following Einstein, I introduced a new coefficient $A_{ab}$ called the Einstein A
coefficient. Note the absence of proportionality to the energy density $u$: the rate of
this process does not depend on the existence of any radiation. Adding this term to
Eq. 15.82, I obtain a corrected version of this equation:


$$\frac{dN_a}{dt} = -N_a B_{ab}u + N_b B_{ba}u + A_{ab}N_b. \tag{15.84}$$

Now the stationary state with time-independent values of the population numbers
emerges if

$$N_b\left(A_{ab} + B_{ba}u\right) = N_a B_{ab}u. \tag{15.85}$$

This result does not look as egregious as the previous attempt: the ratio $N_b/N_a$
now depends on the radiation energy density and three coefficients, two of which
we know, but the third one is yet an enigma. This is as far as I can go without
any additional assumptions about the atoms and the radiation.
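
To convince yourself that the corrected equation really does settle into the stationary state of Eq. 15.85, you can integrate it numerically. The Python sketch below does this with a plain Euler step for the non-degenerate case $B_{ab} = B_{ba} = B$; the dimensionless parameter values, the step size, and the function name are illustrative assumptions of mine, not anything derived in the text.

```python
def evolve_populations(n_a: float, n_b: float, a_coeff: float, b_coeff: float,
                       u: float, dt: float, steps: int):
    """Euler integration of the two-level rate equations with spontaneous emission.

    a_coeff and b_coeff play the roles of the Einstein A and B coefficients
    (non-degenerate case, so a single B serves both directions).
    """
    for _ in range(steps):
        # Net number of atoms moved from |a> to |b> during one time step.
        net_up = (n_a * b_coeff * u - n_b * (b_coeff * u + a_coeff)) * dt
        n_a -= net_up
        n_b += net_up
    return n_a, n_b


if __name__ == "__main__":
    A, B, U = 1.0, 0.5, 2.0  # arbitrary dimensionless values
    n_a, n_b = evolve_populations(1.0, 0.0, A, B, U, dt=1.0e-3, steps=20000)
    print(n_b / n_a, B * U / (A + B * U))  # the two numbers should agree
    print(n_a + n_b)                       # total population stays equal to 1
```

Starting with all atoms in the lower state, the ratio $N_b/N_a$ relaxes to $Bu/(A + Bu)$, which is always smaller than one: with spontaneous emission included, the stationary state no longer forces the populations to be equal.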

So, now imagine that the atoms are enclosed in a box maintained at fixed
temperature $T$ from which no atoms or radiation can escape. If you wait long
enough, all the atoms and the radiation will come to thermal equilibrium with each
other and the walls of the box so that they all can be characterized by a common
temperature $T$. Elementary statistical mechanics tells me that the probability for an
atom (or any other system) to have energy $E$ is proportional to $\exp(-E/k_BT)$, where
$k_B$ is the Boltzmann constant. Applying this to our system of two-level atoms, I
can write $N_a \propto g_a\exp(-E_a/k_BT)$ and $N_b \propto g_b\exp(-E_b/k_BT)$, where $g_{a,b}$ are
degrees of degeneracy of the corresponding level (if there are several states with the
same energy, then the total probability for the system to have this energy must be
multiplied by the number of such states). Then I can write that

$$\frac{N_b}{N_a} = \frac{g_b}{g_a}\exp\left(-\frac{E_b - E_a}{k_BT}\right) = \frac{g_b}{g_a}\exp\left(-\frac{\hbar\omega_{ba}}{k_BT}\right). \tag{15.86}$$


Using Eq. 15.85 to express the electromagnetic spectral energy density in terms of
the population numbers $N_{a,b}$, I have

$$u = \frac{N_b A_{ab}}{N_a B_{ab} - N_b B_{ba}} = \frac{A_{ab}}{\left(N_a/N_b\right)B_{ab} - B_{ba}}.$$


Substitution of Eq. 15.86 yields

$$u = \frac{g_b A_{ab}}{g_a B_{ab}\exp\left(\hbar\omega_{ba}/k_BT\right) - g_b B_{ba}}. \tag{15.87}$$

You know that the energy density for radiation in thermal equilibrium was
found by Max Planck in 1900. This was the first formula in this book, Eq. 1.1, but
for your convenience, I will reproduce it here:

$$u(\omega) = \frac{\hbar\omega^3}{\pi^2c^3}\,\frac{1}{\exp\left(\hbar\omega/k_BT\right) - 1}.$$

Equation 15.87 must agree with Eq. 1.1 because they describe the same quantity. It
is encouraging that the exponential terms in the denominator of both expressions
are identical, and I can make their entire denominators coincide by requiring that
$g_a B_{ab} = g_b B_{ba}$. Isn’t it remarkable that this reproduces the relation between the
Einstein coefficients that I derived just a few lines above using nothing but honest
to G-d quantum mechanical calculations of the transition rates? It turns out that
thermodynamics demands the same. Now, all that is left to make Eq. 15.87 look
exactly like Eq. 1.1 is to require that

$$\frac{A_{ab}}{B_{ba}} = \frac{\hbar\omega^3}{\pi^2c^3}, \tag{15.88}$$


which expresses the unknown coefficient $A_{ab}$ representing the rate of spontaneous
emission in terms of the known quantity $B_{ba}$. Using Eq. 15.80 I finally find

$$A_{ab} = \frac{\left[g(m,m_0)\right]^2\omega^3}{3\pi\varepsilon_0\hbar c^3}, \tag{15.89}$$

which agrees exactly with the result obtained by Dirac in his 1927 paper from his
quantum theory of the electromagnetic field.
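
If you like to see such claims confirmed by numbers, the Python sketch below plugs the two Einstein relations, $g_a B_{ab} = g_b B_{ba}$ and $A_{ab}/B_{ba} = \hbar\omega^3/(\pi^2c^3)$, into the stationary energy density of Eq. 15.87 and compares the outcome with Planck's formula. The function names, the value of $B_{ba}$, and the degeneracies are arbitrary choices made for this illustration; the point is precisely that they drop out.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J*s
K_B = 1.380649e-23      # Boltzmann constant, J/K
C = 299792458.0         # speed of light, m/s


def planck_u(omega: float, temperature: float) -> float:
    """Planck's spectral energy density per unit angular frequency."""
    x = HBAR * omega / (K_B * temperature)
    return HBAR * omega**3 / (math.pi**2 * C**3) / math.expm1(x)


def einstein_u(omega: float, temperature: float, b_ba: float, g_a: int, g_b: int) -> float:
    """Stationary energy density of Eq. 15.87 with the Einstein relations imposed."""
    b_ab = g_b * b_ba / g_a                                # g_a B_ab = g_b B_ba
    a_ab = b_ba * HBAR * omega**3 / (math.pi**2 * C**3)    # Eq. 15.88
    x = HBAR * omega / (K_B * temperature)
    return g_b * a_ab / (g_a * b_ab * math.exp(x) - g_b * b_ba)


if __name__ == "__main__":
    omega, temperature = 3.0e15, 5000.0
    print(planck_u(omega, temperature))
    print(einstein_u(omega, temperature, b_ba=1.0e20, g_a=2, g_b=8))
```

Changing `b_ba`, `g_a`, or `g_b` leaves the second number untouched, which is just another way of saying that the equilibrium radiation knows nothing about the particular atoms it is in equilibrium with.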

This is the point where you are supposed to feel genuinely amazed. Indeed, the 
spontaneous emission is a result of quantum fluctuations of the electromagnetic 
field, and I always maintained that it cannot be described without quantization of the 
field. And here we are: we found this coefficient using nothing but thermodynamics 
and regular quantum mechanics with classical electromagnetic field. This looks 
almost like a miracle and is evidence of deep connections not just between
different laws of physics but between different layers of reality described by them. 

I shall complete this section by computing the power emitted by a quantum
harmonic oscillator via the process of spontaneous emission and comparing this
result with the emission of a classical oscillator. Using Eq. 15.89 for the rate of the
corresponding transitions and Eq. 15.67 to compute the required dipole matrix
elements, I find for the emitted power

$$P_{sp} = \hbar\omega_0 A_{ab} = \frac{e^2\hbar m_0\omega_0^3}{6\pi\varepsilon_0\mu c^3}. \tag{15.90}$$








To compare this with the power emitted by a classical harmonic oscillator, consider a
free (not driven) oscillator with initial energy $E_{cl}$. One can use Larmor’s formula,
well known from electrodynamics, for the power emitted by a charged particle moving
with acceleration $a$:

$$P_{cl} = \frac{e^2a^2}{6\pi\varepsilon_0c^3}. \tag{15.91}$$

The acceleration of a harmonic oscillator is related to its displacement $z$ as $a = -\omega_0^2 z$,
so that the expression for the power becomes

$$P_{cl} = \frac{e^2\omega_0^4 z^2}{6\pi\varepsilon_0c^3}.$$

This expression has to be averaged over the period of oscillations, which, as you
hopefully know,⁸ yields $\overline{z^2} = z_0^2/2$, where $z_0$ is the oscillator’s amplitude related
to its energy as $z_0^2 = 2E_{cl}/\mu\omega_0^2$. Thus, the final expression for the power of the
oscillator’s radiation becomes

$$P_{cl} = \frac{e^2\omega_0^2E_{cl}}{6\pi\varepsilon_0c^3\mu}. \tag{15.92}$$

Replacing $\hbar\omega_0 m_0$ with $E_{m_0} - \hbar\omega_0/2$ in Eq. 15.90, you can see that the
classical and quantum results for the power agree up to the term
$-\hbar\omega_0/2$, which can be neglected in the classical limit $E_{m_0} \gg \hbar\omega_0/2$. At the same
time, this term makes sure that there is no spontaneous emission from the ground
state.
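
The comparison between the two powers is easy to tabulate. In the Python sketch below the quantum power is taken as $e^2\hbar m_0\omega_0^3/(6\pi\varepsilon_0\mu c^3)$ and the classical, period-averaged Larmor power as $e^2\omega_0^2E/(6\pi\varepsilon_0\mu c^3)$, evaluated at the same energy $E = \hbar\omega_0(m_0 + 1/2)$. The function names and the sample mass and frequency are assumptions introduced only for this illustration.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J*s
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
E_CHARGE = 1.602176634e-19  # elementary charge, C
C = 299792458.0             # speed of light, m/s


def quantum_power(m0: int, mass: float, omega0: float) -> float:
    """Spontaneous-emission power of the oscillator in state |m0>, in watts."""
    return E_CHARGE**2 * HBAR * m0 * omega0**3 / (6.0 * math.pi * EPS0 * mass * C**3)


def classical_power(energy: float, mass: float, omega0: float) -> float:
    """Period-averaged Larmor power of a classical oscillator with the given energy."""
    return E_CHARGE**2 * omega0**2 * energy / (6.0 * math.pi * EPS0 * mass * C**3)


def power_ratio(m0: int, mass: float, omega0: float) -> float:
    """Quantum over classical power at the same energy hbar*omega0*(m0 + 1/2)."""
    energy = HBAR * omega0 * (m0 + 0.5)
    return quantum_power(m0, mass, omega0) / classical_power(energy, mass, omega0)


if __name__ == "__main__":
    mass, omega0 = 9.1093837015e-31, 1.0e15  # illustrative values only
    for m0 in (0, 1, 2, 10, 1000):
        print(m0, power_ratio(m0, mass, omega0))
```

The ratio equals $m_0/(m_0 + 1/2)$: it is exactly zero for the ground state and creeps toward one as $m_0$ grows, which is the classical limit described above.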


15.3.3 Optical Transitions in Semiconductors 

Optical transitions in semiconductors are another important area of application of 
Fermi’s golden rule. An important peculiarity of the energy spectrum of electrons 
in semiconductors is that the continuous spectrum of the allowed energy values is 
broken into a sequence of regions separated by intervals of energy values containing 
no states whatsoever—so-called forbidden bands. The actual number of allowed 
bands depends on the properties of atoms (mostly their atomic number) constituting 
a semiconductor, but as far as interaction with light is concerned, only two are of 
real importance: the last completely “filled” band and the first “empty” band (the 
existence of such “full” and “empty” bands is what distinguishes semiconductors 
and dielectrics from metals). The concept of the full or empty bands is closely 


8 For those who forgot: $z(t) = z_0\cos\omega_0 t$, and $\overline{\cos^2(\omega_0 t)} = (1/T)\int_0^T \cos^2(\omega_0 t)\,dt = 1/2$, where
$T = 2\pi/\omega_0$.

related to the discussion of many-particle fermion states we had in Sect. 11.5 (it
might make sense for you to return to this section). Recall that the Pauli principle
requires the ground state of a many-fermion system to involve as many single-particle
orbitals as there are particles. These orbitals, however, belong to energy
eigenvalues, which are distributed between the bands, and when orbitals from the 
lowest bands are all used up, and there are still unassigned electrons, you have to go 
to the next band, and this process repeats itself until all electrons are accounted for. 
If the highest energy level assigned to the last electron lies somewhere in the middle 
of a band so that a spectral distance to the next available level is infinitesimally 
small (remember energy levels inside each band form a continuous spectrum), the 
material behaves as a metal (it takes a very small amount of energy, which can be 
provided by a static electric field to move an electron to a higher energy state where 
it can conduct current). If, however, the last filled level is also the last level in a band 
so that the next available state belongs only to the next band, the material behaves 
as a semiconductor or a dielectric. There is no sharp boundary between the two— 
the difference is in the magnitude of the band gap—for semiconductors it is usually 
much smaller and can range between 0.23 eV in InSb and 3.44 eV in ZnO, while
band gaps of materials usually perceived as dielectric are much higher (e.g., the
band gap of silica is 8.9 eV). The last filled band in a semiconductor is called the valence
band, and the first empty band is called the conduction band. These names reflect the
fact that valence electrons are usually firmly attached to an atom and cannot conduct 
electricity, while electrons excited to a conduction band become freer of their ions 
and can respond to an applied electric field. In a way, you can still think of this 
process as “ionization,” which I discussed in Sect. 15.2.3. The difference is that the 
valence band electrons also have a continuous spectrum and the conduction band 
electrons are not that free and cannot be simply described by free-particle wave 
functions. 
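
If you would like to see the band-filling bookkeeping in action, here is a minimal Python sketch. It only counts orbitals: the function name `classify_by_filling` and the band capacities in the usage line are hypothetical illustrations, not properties of any real material.

```python
def classify_by_filling(band_capacities, n_electrons):
    """Fill bands from the bottom and report where the last electron lands.

    band_capacities: number of single-particle orbitals (spin included)
    in each band, lowest band first. All numbers are illustrative.
    """
    remaining = n_electrons
    for capacity in band_capacities:
        if remaining <= 0:
            break
        if remaining < capacity:
            # The next free level sits in the same band, arbitrarily close.
            return "partially filled band"
        remaining -= capacity  # this band is used up, move to the next one
    if remaining > 0:
        raise ValueError("more electrons than available orbitals")
    # The next free level lies across a gap, in the next band.
    return "filled band"


print(classify_by_filling([4, 4, 4], 6))
print(classify_by_filling([4, 4, 4], 8))
```

The first call ends in the middle of the second band (metal-like behavior), while the second one uses up the second band completely, so the next available orbital belongs to the third band (semiconductor-like or dielectric-like behavior, depending on the size of the gap).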

The curious band structure of electron energy levels in semiconductors is the 
direct consequence of the periodicity in the arrangement of atoms in these materials. 
Periodicity means that one can find a small group of atoms contained in such a 
box that the entire structure can be reproduced by translating this box by one of 
three non-collinear vectors $\mathbf{a}_i$ (I am going to assume for simplicity that they all are
mutually perpendicular and can be used, therefore, to define directions of Cartesian
coordinate axes). These vectors define the periodicity of the structure in the sense
that if you take a coordinate of an arbitrary atom and add one of the vectors $\mathbf{a}_i$,
you will get a coordinate of another atom of exactly the same element in the same
position with respect to other atoms; see Fig. 15.5.

I am not going to torture you with any derivations related to the theory of wave 
functions describing the stationary states of electrons in such a periodic environment 
and their corresponding energy eigenvalues. What you need to know is that the 
wave functions representing states in both valence and conduction bands have the 
following form: 


$$\psi_{v,c}(\mathbf{k}_{v,c}, \mathbf{r}) = \frac{1}{\sqrt{V}}\, u_{v,c}(\mathbf{r})\, e^{i\mathbf{k}_{v,c}\cdot\mathbf{r}} \tag{15.93}$$

Fig. 15.5 Example of a periodic arrangement of atoms in what is called a crystal lattice


These states are characterized by a discrete index, signifying a particular band
to which they belong, and by quasi-continuous wave vectors $\mathbf{k}_{v,c}$, discretized by
the same periodic boundary conditions as in Eq. 11.45. For the purpose of this
discussion, I need only two bands represented in Eq. 15.93 by indexes $v$ and
$c$ corresponding to the valence and conduction bands. Functions $u_{v,c}(\mathbf{r})$ in this
equation are periodic: $u_{v,c}(\mathbf{r} + \mathbf{a}_i) = u_{v,c}(\mathbf{r})$, and $V$ is the normalization volume
defined by the same boundary conditions. For small enough $k_{c,v}$ ($k_{c,v} a_i \ll 1$),
the energy of a single electron in the conduction and valence bands can often be
approximated by


$$E_c = E_g + \frac{\hbar^2 k_c^2}{2m_c} \tag{15.94}$$

$$E_v = -\frac{\hbar^2 k_v^2}{2m_v} \tag{15.95}$$

These formulas look somewhat like expressions for kinetic energy of a free
particle but with two significant differences: the energy of the electron in the
conduction band does not go to zero as $k_c \to 0$ because of the extra term $E_g$,
and the energy of the states in the valence band is negative and decreases with
increasing $k_v$. Both these peculiarities reflect the presence of the gap $E_g$ in the
spectrum of energies, and the supposition (valid for many semiconductors) that
states with $k_{v,c} = 0$ correspond to the maximum and the minimum energies in the
valence and conduction bands, respectively. If I choose the zero level of energy
to coincide with the maximum energy in the valence band, then the lowest energy
in the conduction band must be $E_g$, which appears in Eq. 15.94. This choice also
automatically makes all energies in the valence band negative, and they become
more negative the farther away from $k_v = 0$ you go. Parameters $m_c$ and $m_v$ have the
dimensions of mass and appear in the place of a particle’s mass in many quantum
mechanical formulas, but they are not equal to the regular electron mass $m_e$. These
parameters are called effective masses and are usually much smaller than $m_e$ with
$m_c$ often less than $m_v$.
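
To get a feeling for the numbers, here is a minimal Python sketch of the parabolic approximation of Eqs. 15.94 and 15.95. The gap and the effective masses below are illustrative, roughly GaAs-like values that I assume for the sake of the example; they are not taken from the text, and the function names are hypothetical.

```python
HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837015e-31   # free-electron mass, kg
EV = 1.602176634e-19     # joules per electronvolt


def conduction_energy(k, gap_ev, m_c):
    """Eq. 15.94 in eV; k in 1/m, m_c in kg."""
    return gap_ev + (HBAR * k) ** 2 / (2.0 * m_c) / EV


def valence_energy(k, m_v):
    """Eq. 15.95 in eV; zero of energy at the top of the valence band."""
    return -((HBAR * k) ** 2) / (2.0 * m_v) / EV


def reduced_mass(m_c, m_v):
    """Reduced effective mass, 1/mu = 1/m_c + 1/m_v."""
    return m_c * m_v / (m_c + m_v)


# Assumed, roughly GaAs-like parameters (illustration only).
GAP_EV = 1.42
M_C = 0.067 * M_E
M_V = 0.5 * M_E

k = 5.0e8  # wave number in 1/m, small compared to the inverse lattice period
photon_ev = conduction_energy(k, GAP_EV, M_C) - valence_energy(k, M_V)
print(photon_ev)
```

The difference `photon_ev` is the energy needed to move an electron between valence and conduction states with the same wave number; it equals the gap plus a kinetic-like term with the reduced effective mass in place of the mass.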

With some preliminary groundwork laid down, I can start talking about optical 
transitions between states in the valence and conduction bands (these are the only
relevant transitions for light with frequencies between infrared and visible). From a 
general point of view, this situation is an extension of the discussion of Sect. 15.2.3 
in two directions: first, now both initial and final states belong to continuous spectra, 
and second, we are dealing with an ensemble of fermions rather than with a single 
particle. The first of these circumstances means that the conversion of the transition
probability into the probability density started out in Eq. 15.37 and continued in 
Eq. 15.38 must now include two sets of quasi-continuous wave numbers for the 
valence and conduction bands: 


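$$dp_{mm_0} = p_{mm_0}\,\Delta m_{01}\Delta m_{02}\Delta m_{03}\,\Delta m_1\Delta m_2\Delta m_3 = \left[\frac{L^3}{(2\pi)^3}\right]^2 p_{s_0,s}(\mathbf{k}_c,\mathbf{k}_v)\,\Delta k_1^{(c)}\Delta k_2^{(c)}\Delta k_3^{(c)}\,\Delta k_1^{(v)}\Delta k_2^{(v)}\Delta k_3^{(v)} \tag{15.96}$$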


The many-body nature of the semiconductor transitions manifests itself first of
all via the Pauli principle affecting the occupation of single-particle orbitals by
identical fermions even if you can neglect the Coulomb interaction between them
(see Chap. 11 if you need to refresh your memory of these ideas). Because of the
Pauli principle, any single-particle orbital can be either occupied (included into a
determinant representing a wave function of a many-particle electron state) or empty
(not part of the determinant). Neglecting the interaction between the electrons,
you can still describe the transitions in terms of the single-particle orbitals, but
you have to take into account that a transition from a state $|\psi^{(v)}_{m_0}\rangle$ in the valence band to
a state $|\psi^{(c)}_{m}\rangle$ in the conduction band is now contingent upon two conditions: first, that the
initial state is occupied and, second, that the final state is empty (you cannot put a
second fermion in a state which is already occupied, since adding to a Slater
determinant another row consisting of an orbital already present there renders
the whole thing equal to zero). This circumstance turns the probability $dp_{mm_0}$ into a
conditional probability, and according to a general theorem of probability
theory, in order to find the actual probability of the transition, you must multiply
$dp_{mm_0}$ by the probability that both conditions are realized simultaneously. To make
this correction, I need two probability distributions: one, $f_v(\mathbf{k}_v)$, is the probability
of a given orbital in the valence band to be occupied, and the other, $f_c(\mathbf{k}_c)$, contains
the same information about the conduction band. In the continuous limit, both these
distribution functions become probability densities. Since a given orbital can be
either occupied or empty, the probability of it being empty is obviously $1 - f_{v,c}(\mathbf{k}_{v,c})$
for both valence and conduction bands. Since any orbital can contain either one or
zero electrons, integration over the orbitals weighted with the respective probability
densities is equivalent to the summation over all electrons, so that the total transition
probability $p_{vc}$ from any valence band orbital to any conduction band orbital can be
found as


$$p_{vc} = \sum_{s_0,s}\iint p_{s_0,s}(\mathbf{k}_c,\mathbf{k}_v)\, f_v(\mathbf{k}_v)\,(1 - f_c(\mathbf{k}_c))\, d^3k_v\, d^3k_c, \tag{15.97}$$


where indexes $s_0, s$ indicate the initial and final spin states of the electron. This
expression contains two factors with very distinct physical meaning. Factor
$p_{s_0,s}(\mathbf{k}_c,\mathbf{k}_v)$ is your regular quantum transition probability, which we have been
discussing in this chapter all along. Factors $f_v(\mathbf{k}_v)(1 - f_c(\mathbf{k}_c))$ express a different
kind of uncertainty, which is also inherent to the system under consideration but has
nothing to do with quantum mechanics. It simply reflects the fact that in a system
comprising many particles, detailed information about any particular particle is
inaccessible. This situation is typical for all many-particle systems regardless of
their quantum or classical nature. In classical statistical mechanics, the lack of
knowledge of coordinates and momenta of individual particles is manifested
in the form of the Maxwell-Boltzmann distribution, while in the quantum case, a
similar role is played by functions $f_{v,c}(\mathbf{k}_{v,c})$, which have the form of the famous
Fermi-Dirac distribution:

$$f_{v,c}(\mathbf{k}_{v,c}) = \frac{1}{\exp\left[\dfrac{E_{v,c}(\mathbf{k}_{v,c}) - \mu}{k_B T}\right] + 1}$$

Here $T$ is the temperature and $k_B$ is the Boltzmann constant. Parameter $\mu$ in this
expression is called a chemical potential and is essentially a
normalization parameter found by requiring that the integral of $f(\mathbf{k})$ over all $\mathbf{k}$ yields
the total number of particles in the system. I am going to spare you from the task of
deriving this function (you can easily find several different derivations if you want),
and I will not be using it in what follows. But I do want to note that as temperature
$T$ approaches absolute zero, the Fermi-Dirac function turns into a step function,
which takes values equal to one for $E < \mu$, and zero for $E > \mu$. Indeed, in the
former case, the argument of the exponential function is negative so that it vanishes
in the limit $T \to 0$, while in the latter case, this argument is positive, making the
exponential function infinite when the temperature goes to zero and making,
thereby, the entire function $f$ vanish. Such a distribution describes the ground state of the
many-fermion system with all states in the valence band filled with probability equal
to unity and all states in the conduction band remaining empty. It does not mean,
however, that parameter $\mu$ is necessarily equal to the energy of the last filled orbital.
It would have been true for metals, but in semiconductors, the presence of the gap
pushes the value of $\mu$ toward the middle of the band gap where no single-particle
states can exist.
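
The low-temperature behavior of the occupation probability is easy to check numerically. The following minimal Python sketch assumes the standard Fermi-Dirac form with energies measured in eV; the function name `fermi_dirac` and the sample numbers (a chemical potential of 0.5 eV) are hypothetical choices for illustration only.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant, eV/K


def fermi_dirac(energy_ev, mu_ev, temperature_k):
    """Occupation probability of an orbital with the given energy."""
    if temperature_k == 0.0:
        # Zero-temperature limit: a step function at the chemical potential.
        if energy_ev < mu_ev:
            return 1.0
        if energy_ev > mu_ev:
            return 0.0
        return 0.5
    x = (energy_ev - mu_ev) / (K_B_EV * temperature_k)
    # Guard against overflow of exp() for very large arguments.
    if x > 700.0:
        return 0.0
    if x < -700.0:
        return 1.0
    return 1.0 / (math.exp(x) + 1.0)


for t in (300.0, 30.0, 3.0):
    print(t, fermi_dirac(0.45, 0.5, t), fermi_dirac(0.55, 0.5, t))
```

As the temperature drops, the first column of probabilities (energy below the chemical potential) approaches one and the second (energy above it) approaches zero, which is the step-function behavior described above.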
With the help of Eq. 15.38, I can present the total rate of transitions from the
valence to conduction band as

$$R_{vc} = \frac{2\pi}{\hbar}\left[\frac{L^3}{(2\pi)^3}\right]^2 \iint \sum_{s_0,s} |v_{s_0,s}(\mathbf{k}_v,\mathbf{k}_c)|^2 f_v(\mathbf{k}_v)\,(1 - f_c(\mathbf{k}_c)) \times \delta\left[\hbar\Omega - E_c(\mathbf{k}_c) + E_v(\mathbf{k}_v)\right] d^3k_v\, d^3k_c, \tag{15.98}$$


where the only remaining discrete indexes refer to the two spin states of the
electrons. If the interaction between light and electrons is spin independent, as is
often the case, the perturbation matrix element is diagonal with respect to spin
indexes and can be presented as $v_{s_0,s}(\mathbf{k}_v,\mathbf{k}_c) = \delta_{s_0,s}\, v(\mathbf{k}_v,\mathbf{k}_c)$. In this case, the
summation over the spin variables is reduced to an extra factor of two in the
expression for the transition rate:

$$R_{vc} = \frac{4\pi}{\hbar}\left[\frac{L^3}{(2\pi)^3}\right]^2 \iint |v(\mathbf{k}_v,\mathbf{k}_c)|^2 f_v(\mathbf{k}_v)\,(1 - f_c(\mathbf{k}_c)) \times \delta\left[\hbar\Omega - E_c(\mathbf{k}_c) + E_v(\mathbf{k}_v)\right] d^3k_v\, d^3k_c. \tag{15.99}$$


Since the matrix elements $v(\mathbf{k}_v,\mathbf{k}_c)$ are the same for the transitions in both
directions, I can easily turn this expression into the rate for the reverse transitions
(from the conduction to the valence band) by simply interchanging the distribution
functions $f_v(\mathbf{k}_v)$ and $f_c(\mathbf{k}_c)$:

$$R_{cv} = \frac{4\pi}{\hbar}\left[\frac{L^3}{(2\pi)^3}\right]^2 \iint |v(\mathbf{k}_v,\mathbf{k}_c)|^2 f_c(\mathbf{k}_c)\,(1 - f_v(\mathbf{k}_v)) \times \delta\left[\hbar\Omega - E_c(\mathbf{k}_c) + E_v(\mathbf{k}_v)\right] d^3k_v\, d^3k_c. \tag{15.100}$$


It follows immediately from these equations that at zero temperature, when all
orbitals in the valence band are filled with probability one, and all orbitals in the
conduction band are empty, the probability distribution functions become $f_v = 1$,
$f_c = 0$, and one can generate only absorbing transitions from the valence to
the conduction band. Indeed, if the system is already in the ground state, it cannot
further lower its energy by emitting light, be it a single electron or a whole bunch
of them filling the entire valence band. The net transition rate for non-zero
temperature, $R_{net} = R_{vc} - R_{cv}$, is given by

$$R_{net} = \frac{4\pi}{\hbar}\left[\frac{L^3}{(2\pi)^3}\right]^2 \iint |v(\mathbf{k}_v,\mathbf{k}_c)|^2 \left(f_v(\mathbf{k}_v) - f_c(\mathbf{k}_c)\right) \times \delta\left[\hbar\Omega - E_c(\mathbf{k}_c) + E_v(\mathbf{k}_v)\right] d^3k_v\, d^3k_c. \tag{15.101}$$
Normally, the occupation probability in the valence band is larger than in the
conduction band, $f_v(\mathbf{k}_v) > f_c(\mathbf{k}_c)$, so that the net flow of electrons is from the
valence to the conduction band, which results in net absorption of light. If, however,
you manage somehow to achieve the reverse relation (a population inversion), the
system will become a net emitter of radiation. I think I have already mentioned that
a population inversion is one of the preconditions for achieving lasing.
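
The step from the rates of the direct and reverse transitions to the net rate rests on a one-line identity for the occupation factors, which is worth writing out: the products of the two distribution functions cancel, leaving only their difference.

```latex
f_v(\mathbf{k}_v)\,\bigl(1 - f_c(\mathbf{k}_c)\bigr) - f_c(\mathbf{k}_c)\,\bigl(1 - f_v(\mathbf{k}_v)\bigr)
  = f_v - f_v f_c - f_c + f_c f_v
  = f_v(\mathbf{k}_v) - f_c(\mathbf{k}_c)
```

This is why the sign of the net rate is decided entirely by which of the two bands is more populated at the wave numbers connected by the transition.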

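What is left for me now is to compute the transition matrix elements $v(\mathbf{k}_v,\mathbf{k}_c)$
and evaluate the integrals over the wave numbers. To achieve this goal, I first need,
according to Eqs. 15.63 and 15.93, to calculate the vector dipole matrix element:

$$\mathbf{d}(\mathbf{k}_v,\mathbf{k}_c) = \frac{1}{V}\int e^{-i(\mathbf{k}_v-\mathbf{k}_c)\cdot\mathbf{r}}\, u_v^*(\mathbf{r})\,\mathbf{r}\,u_c(\mathbf{r})\,d^3r, \tag{15.102}$$

which determines $v(\mathbf{k}_v,\mathbf{k}_c)$ via

$$v(\mathbf{k}_v,\mathbf{k}_c) \propto \mathbf{w}\cdot\mathbf{d}(\mathbf{k}_v,\mathbf{k}_c), \tag{15.103}$$

with the prefactor set by the electron charge and the field amplitude as in Eq. 15.63.
Figuring out the integral in Eq. 15.102 takes effort and imagination, so fasten your
seat belt: you are in for a ride.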

Remember that semiconductors are built by periodic repetition of a single box, the
elementary cell? Well, each cell can be characterized by a position vector $\mathbf{R}_n$ of its
center (or one of the corners, it does not matter) defined by three integer numbers
(relative to the same common origin as the integration variable $\mathbf{r}$, see Fig. 15.5):

$$\mathbf{R}_{n_1,n_2,n_3} = n_1\mathbf{a}_1 + n_2\mathbf{a}_2 + n_3\mathbf{a}_3, \tag{15.104}$$

where $n_i$ changes between 0 and $N_i - 1$, with $N_i = L/a_i$ a presumed integer, and is used
to enumerate all these cells. You can think of the entire integration volume in
Eq. 15.102 as being filled by $N_c = L^3/(a_1a_2a_3) = N_1N_2N_3$ non-overlapping (and
leaving no gaps) identical cells of volume $v = a_1a_2a_3$. The integral, then, can be
presented as a sum of integrals over each of these cells as

$$\mathbf{d}(\mathbf{k}_v,\mathbf{k}_c) = \frac{1}{V}\sum_{n_1=0}^{N_1-1}\sum_{n_2=0}^{N_2-1}\sum_{n_3=0}^{N_3-1}\int_{v_n} e^{-i(\mathbf{k}_v-\mathbf{k}_c)\cdot\mathbf{r}}\, u_v^*(\mathbf{r})\,\mathbf{r}\,u_c(\mathbf{r})\,d^3r, \tag{15.105}$$

where $v_n$ is the region inside the $n$th cell. (Here and elsewhere I will sometimes use
$n = \{n_1, n_2, n_3\}$ as a compound index for the sake of brevity.) If you feel uneasy
about this trick, consider its one-dimensional version: an integral from 0 to $L$ can be
presented as $\int_0^L f(x)\,dx = \int_0^a f(x)\,dx + \int_a^{2a} f(x)\,dx + \cdots + \int_{(n-1)a}^{na} f(x)\,dx$, where $L = na$.
Next, take an arbitrary point inside a cell and define its position relative to its center
by vector $\boldsymbol{\rho}$, such that you can present the point’s position vector $\mathbf{r}$ as $\mathbf{r} = \boldsymbol{\rho} + \mathbf{R}_n$
(see Fig. 15.6).
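
If the cell-by-cell trick still makes you uneasy, you can test its one-dimensional version numerically. The sketch below is a minimal illustration under my own assumptions: the periodic function `periodic_u`, the wave number `Q`, the cell size, and the simple midpoint rule are all hypothetical choices, not anything prescribed by the text.

```python
import cmath
import math


def midpoint_integral(f, a, b, steps):
    """Midpoint-rule estimate of the integral of f from a to b."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def integral_over_cells(f, cell_size, n_cells, steps_per_cell):
    """Same integral, computed as a sum of integrals over individual cells."""
    return sum(
        midpoint_integral(f, j * cell_size, (j + 1) * cell_size, steps_per_cell)
        for j in range(n_cells)
    )


A_CELL = 0.5   # cell size (arbitrary units, assumed)
N_CELLS = 8    # number of cells, so that L = N_CELLS * A_CELL
Q = 1.3        # plays the role of the wave-number difference (assumed)


def periodic_u(x):
    """A function with the period of the lattice."""
    return 2.0 + math.cos(2.0 * math.pi * x / A_CELL)


def integrand(x):
    return cmath.exp(-1j * Q * x) * periodic_u(x)


whole = midpoint_integral(integrand, 0.0, A_CELL * N_CELLS, 200 * N_CELLS)
by_cells = integral_over_cells(integrand, A_CELL, N_CELLS, 200)
first_cell = midpoint_integral(integrand, 0.0, A_CELL, 200)
fourth_cell = midpoint_integral(integrand, 3 * A_CELL, 4 * A_CELL, 200)
print(abs(whole - by_cells))
print(abs(fourth_cell - cmath.exp(-1j * Q * 3 * A_CELL) * first_cell))
```

Both printed differences are at the level of rounding errors. The second one illustrates the point of shifting the integration variable: thanks to the periodicity, the integral over any cell differs from the integral over the first cell only by a phase factor fixed by the position of the cell.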
Fig. 15.6 Vectors involved in the transition from integration over an entire volume to the
integration over the elementary cell represented by a single square


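Substituting this into Eq. 15.105 and taking advantage of the periodicity of
functions $u_{v,c}(\mathbf{r})$: $u_{v,c}(\boldsymbol{\rho} + \mathbf{R}_n) = u_{v,c}(\boldsymbol{\rho})$, I obtain

$$\begin{aligned}
\mathbf{d}(\mathbf{k}_v,\mathbf{k}_c) &= \frac{1}{V}\sum_{n_1}\sum_{n_2}\sum_{n_3}\int_{v_n} e^{-i(\mathbf{k}_v-\mathbf{k}_c)\cdot(\boldsymbol{\rho}+\mathbf{R}_n)}\, u_v^*(\boldsymbol{\rho})\,(\boldsymbol{\rho}+\mathbf{R}_n)\,u_c(\boldsymbol{\rho})\,d^3\rho \\
&= \frac{1}{V}\sum_{n_1}\sum_{n_2}\sum_{n_3} e^{-i(\mathbf{k}_v-\mathbf{k}_c)\cdot\mathbf{R}_n}\int_{v_n} e^{-i(\mathbf{k}_v-\mathbf{k}_c)\cdot\boldsymbol{\rho}}\, u_v^*(\boldsymbol{\rho})\,\boldsymbol{\rho}\,u_c(\boldsymbol{\rho})\,d^3\rho \\
&\quad + \frac{1}{V}\sum_{n_1}\sum_{n_2}\sum_{n_3} e^{-i(\mathbf{k}_v-\mathbf{k}_c)\cdot\mathbf{R}_n}\,\mathbf{R}_n\int_{v_n} e^{-i(\mathbf{k}_v-\mathbf{k}_c)\cdot\boldsymbol{\rho}}\, u_v^*(\boldsymbol{\rho})\,u_c(\boldsymbol{\rho})\,d^3\rho.
\end{aligned}$$

The last line in this expression vanishes because the wave functions $\psi_{c,v}(\mathbf{r})$,
representing eigenvectors of a Hermitian operator, are orthogonal.$^9$ The integral in
the second line,

$$\tilde{\mathbf{d}}(\mathbf{k}_v,\mathbf{k}_c) = \int_{v_n} e^{-i(\mathbf{k}_v-\mathbf{k}_c)\cdot\boldsymbol{\rho}}\, u_v^*(\boldsymbol{\rho})\,\boldsymbol{\rho}\,u_c(\boldsymbol{\rho})\,d^3\rho,$$

which I will call a reduced dipole matrix element, is taken over the volume of an
elementary cell and is the same for all of them. Thus, it can be factored out of the
sum over the cells, so that $\mathbf{d}(\mathbf{k}_v,\mathbf{k}_c)$ takes the following form:

$$\mathbf{d}(\mathbf{k}_v,\mathbf{k}_c) = \frac{1}{V}\,\tilde{\mathbf{d}}(\mathbf{k}_v,\mathbf{k}_c)\sum_{n_1}\sum_{n_2}\sum_{n_3} e^{-i(\mathbf{k}_v-\mathbf{k}_c)\cdot\mathbf{R}_n}.$$

9 For especially pedantic and inquisitive readers, I will note: yes, you are right, the orthogonality
condition must include integration over the entire volume, while the integral I claim is vanishing
includes integration only over a single cell. However, you might also notice that such integrals
over each elementary cell are all equal to each other, and the integral over the entire volume is
their sum. Thus, the orthogonality condition in this case can be naturally formulated for the inner
product defined as an integral over a single cell.
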

All that is left now is to figure out the sum over the elementary cells. Using the
representation of $\mathbf{R}_n$ in terms of periodicity parameters of the crystal structure,
Eq. 15.104, I can write

$$\sum_{n_1}\sum_{n_2}\sum_{n_3} e^{-i(\mathbf{k}_v-\mathbf{k}_c)\cdot\mathbf{R}_n} = \sum_{n_1=0}^{N_1-1} e^{-i\Delta k_x a_1 n_1}\sum_{n_2=0}^{N_2-1} e^{-i\Delta k_y a_2 n_2}\sum_{n_3=0}^{N_3-1} e^{-i\Delta k_z a_3 n_3},$$

where I assumed that vectors $\mathbf{a}_1$, $\mathbf{a}_2$, and $\mathbf{a}_3$ point along the X-, Y-, and Z-coordinate
axes, respectively. Each of these sums is a geometric progression with the first
term equal to unity and ratio $q_i = e^{-i\Delta k_i a_i}$ ($\Delta k_i = k_{v,i} - k_{c,i}$), with $i$ taking
values $x$, $y$, and $z$ (so that $a_x \equiv a_1$, and so on). A well-known formula for the sum of the geometric progression
yields

$$\sum_{n_i=0}^{N_i-1} e^{-i\Delta k_i a_i n_i} = \frac{1 - e^{-i\Delta k_i a_i N_i}}{1 - e^{-i\Delta k_i a_i}} = e^{-i\Delta k_i a_i (N_i-1)/2}\,\frac{\sin\left(\Delta k_i a_i N_i/2\right)}{\sin\left(\Delta k_i a_i/2\right)},$$

and after combining all these sums, the matrix element becomes

$$\mathbf{d}(\mathbf{k}_v,\mathbf{k}_c) = \frac{1}{V}\,\tilde{\mathbf{d}}(\mathbf{k}_v,\mathbf{k}_c)\prod_{i=x,y,z} e^{-i\Delta k_i a_i (N_i-1)/2}\,\frac{\sin\left(\Delta k_i a_i N_i/2\right)}{\sin\left(\Delta k_i a_i/2\right)}. \tag{15.106}$$
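
The closed form for the sum over the cells along one axis is easy to verify numerically. In the minimal Python sketch below, I assume that the cell index runs over the values 0, 1, ..., N - 1 (one term per cell); the function names and the sample values of the phase `dk_a` are hypothetical.

```python
import cmath
import math


def lattice_sum_direct(dk_a, n_cells):
    """Term-by-term sum of exp(-i * dk_a * n) for n = 0 .. n_cells - 1."""
    return sum(cmath.exp(-1j * dk_a * n) for n in range(n_cells))


def lattice_sum_closed(dk_a, n_cells):
    """Geometric-progression result: a phase factor times a ratio of sines."""
    half = 0.5 * dk_a
    if abs(math.sin(half)) < 1e-14:
        # Every term equals one when dk_a is a multiple of 2*pi.
        return complex(n_cells)
    phase = cmath.exp(-1j * half * (n_cells - 1))
    return phase * math.sin(half * n_cells) / math.sin(half)


for dk_a in (0.3, 1.7, -2.2):
    print(abs(lattice_sum_direct(dk_a, 25) - lattice_sum_closed(dk_a, 25)))
```

The printed differences are at the level of rounding errors. Only the ratio of sines matters for what follows, since the phase factor drops out as soon as you take the absolute value squared.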


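For the transition probabilities and rates, I need $|\mathbf{d}(\mathbf{k}_v,\mathbf{k}_c)\cdot\mathbf{w}|^2$ (in case you forgot,
$\mathbf{w}$ is the polarization vector of the electromagnetic wave):

$$|\mathbf{d}(\mathbf{k}_v,\mathbf{k}_c)\cdot\mathbf{w}|^2 = \frac{1}{V^2}\,|\tilde{\mathbf{d}}(\mathbf{k}_v,\mathbf{k}_c)\cdot\mathbf{w}|^2\prod_{i=x,y,z}\frac{\sin^2\left(\Delta k_i a_i N_i/2\right)}{\sin^2\left(\Delta k_i a_i/2\right)}.$$

Function

$$F_N(\Delta k_i a_i) = \frac{\sin^2\left(\Delta k_i a_i N_i/2\right)}{\sin^2\left(\Delta k_i a_i/2\right)}$$

appears quite often in physics, for instance, in the theory of a diffraction grating.
For large values of $N_i$, it is characterized by a maximum at $\Delta k_i = 0$, where the
function takes the value of $N_i^2$. The larger the parameter $N_i$, the narrower the peak
becomes and the more of the area bounded by the function lies in the small vicinity
of $\Delta k_i = 0$. For small $\Delta k_i$, I can replace the sine function in the denominator with
its argument, reducing $F_N(\Delta k_i a_i)$ to your old acquaintance, the sinc function that
you first encountered during the derivation of Fermi’s golden rule in Sect. 15.2.1:

$$F_N(\Delta k_i a_i) \approx \frac{\sin^2\left(\Delta k_i a_i N_i/2\right)}{\left(\Delta k_i a_i/2\right)^2} = N_i^2\,\mathrm{sinc}^2\left(\frac{\Delta k_i a_i N_i}{2}\right).$$

Using Eq. 15.26 with $v$ identified with $\Delta k_i a_i$ and $a$ with $N_i$, I can write

$$\lim_{N_i\to\infty} F_N(\Delta k_i a_i) = 2\pi N_i\,\delta(\Delta k_i a_i) = \frac{2\pi N_i}{a_i}\,\delta(\Delta k_i), \tag{15.107}$$

where at the last step I used a known scaling property of the delta-functions:
$\delta(ax) = \delta(x)/a$ (I apologize for having too many $a$s with different meanings in
the last three lines and hope you won’t let yourself get confused by them). Taking
into account Eq. 15.107, and identifying $\delta(\Delta k_x)\,\delta(\Delta k_y)\,\delta(\Delta k_z)$ with $\delta(\mathbf{k}_v - \mathbf{k}_c)$
(recall Eq. 2.39), I find for $|\mathbf{d}(\mathbf{k}_v,\mathbf{k}_c)\cdot\mathbf{w}|^2$ (recall that $V = L^3$):

$$|\mathbf{d}(\mathbf{k}_v,\mathbf{k}_c)\cdot\mathbf{w}|^2 = \frac{(2\pi)^3 N_c}{V^2 v}\,|\tilde{\mathbf{d}}(\mathbf{k}_v,\mathbf{k}_c)\cdot\mathbf{w}|^2\,\delta(\mathbf{k}_v - \mathbf{k}_c). \tag{15.108}$$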
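
The properties of the ratio of squared sines that make it behave like a delta-function can be checked numerically. The following is a minimal Python sketch; the function names, the choice of 50 cells, and the midpoint rule are my own assumptions made for illustration.

```python
import math


def f_n(x, n_cells):
    """sin^2(n x / 2) / sin^2(x / 2), with the limiting value at x = 0."""
    s = math.sin(0.5 * x)
    if abs(s) < 1e-12:
        return float(n_cells) ** 2
    return (math.sin(0.5 * x * n_cells) / s) ** 2


def area(n_cells, half_width=math.pi, steps=4000):
    """Midpoint-rule area under f_n on the interval [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    return h * sum(f_n(-half_width + (i + 0.5) * h, n_cells) for i in range(steps))


N = 50
peak = f_n(0.0, N)                       # height of the central maximum
total = area(N)                          # area over one full period
central = area(N, 2.0 * math.pi / N)     # area between the first two zeros
print(peak, total / (2.0 * math.pi * N), central / total)
```

The peak height equals the number of cells squared, the area over one period equals $2\pi$ times the number of cells, and roughly nine tenths of that area sits between the first zeros on either side of the maximum, an interval that shrinks as the number of cells grows.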


Take a breath and pat yourself on the back: you wandered into rugged terrain
and came back mentally in one piece. From this point on, it will be all downhill.
First, use Eq. 15.103 to find the perturbation matrix element $|v(\mathbf{k}_v,\mathbf{k}_c)|^2$,
substitute it into Eq. 15.101 for the net transition rate, and get rid of the integration
over one of the wave numbers using the delta-function:

$$R_{net} \propto \int |\tilde{\mathbf{d}}(\mathbf{k},\mathbf{k})\cdot\mathbf{w}|^2\,\left(f_v(\mathbf{k}) - f_c(\mathbf{k})\right)\,\delta\left[\hbar\Omega - E_c(\mathbf{k}) + E_v(\mathbf{k})\right] d^3k.$$

Now you can go back to Eqs. 15.94 and 15.95 and write $R_{net}$ as

$$R_{net} \propto \int |\tilde{\mathbf{d}}(\mathbf{k},\mathbf{k})\cdot\mathbf{w}|^2\,\left(f_v(k) - f_c(k)\right)\,\delta\left[\hbar\Omega - E_g - \frac{\hbar^2k^2}{2\mu}\right] d^3k, \tag{15.109}$$

where I introduced the reduced effective mass $\mu$, $1/\mu = 1/m_c + 1/m_v$, and
remembered that the distribution functions depend only on the magnitude of the
wave number $k$ by way of the corresponding energies $E_v$ and $E_c$. The last
comment is important as it allows me to carry out the integration over directions
of $\mathbf{k}$ in Eq. 15.109 using the spherical coordinate representation for $\mathbf{k}$. Neglecting
a weak $\mathbf{k}$-dependence of the matrix element $\tilde{\mathbf{d}}(\mathbf{k},\mathbf{k})$ and replacing it with the
constant vector $\tilde{\mathbf{d}}$, I can evaluate Eq. 15.109 to be

$$R_{net} \propto 4\pi\,|\tilde{\mathbf{d}}\cdot\mathbf{w}|^2\int_0^\infty \left(f_v(k) - f_c(k)\right)\,\delta\left[\hbar\Omega - E_g - \frac{\hbar^2k^2}{2\mu}\right] k^2\,dk,$$

where I took into account that the integral over angular variables yields $4\pi$. To
evaluate the remaining integral over $k$, I make a substitution of variable $\epsilon =
\hbar^2k^2/2\mu$ that yields

$$\int_0^\infty \left(f_v(k) - f_c(k)\right)\,\delta\left[\hbar\Omega - E_g - \frac{\hbar^2k^2}{2\mu}\right] k^2\,dk = \frac{\sqrt{2}\,\mu^{3/2}}{\hbar^3}\int_0^\infty \left(f_v - f_c\right)\,\delta\left[\hbar\Omega - E_g - \epsilon\right]\sqrt{\epsilon}\,d\epsilon = \frac{\sqrt{2}\,\mu^{3/2}}{\hbar^3}\left[f_v(\hbar\Omega - E_g) - f_c(\hbar\Omega - E_g)\right]\sqrt{\hbar\Omega - E_g}\;\theta(\hbar\Omega - E_g),$$

where the step function $\theta(x)$ indicates the impossibility of fulfilling the condition $\hbar\Omega -
E_g = \epsilon > 0$ unless $\hbar\Omega > E_g$. This condition reflects a physically quite
obvious circumstance: it is impossible to cause a transition from the valence to
the conduction band if the energy of a respective photon is less than the energy
required to bridge the band gap. Light with frequencies not satisfying this condition
does not interact with electrons of the semiconductor and merely passes through.
Therefore, semiconductors remain almost transparent to frequencies smaller than
$E_g/\hbar$. (Almost because there are always other mechanisms for light absorption not
accounted for by this model.)
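
The way the delta-function turns the integral over the wave number into a square-root dependence on the excess photon energy can be checked numerically. The minimal Python sketch below works in units where both the reduced Planck constant and the reduced effective mass equal one, sets the distribution functions aside, and mimics the delta-function by a narrow Gaussian; all of these are my assumptions for the sake of illustration, as are the function names.

```python
import math


def delta_gauss(x, width):
    """Narrow normalized Gaussian standing in for the delta-function."""
    return math.exp(-0.5 * (x / width) ** 2) / (width * math.sqrt(2.0 * math.pi))


def k_integral_numeric(excess, width=1e-3, k_max=5.0, steps=100000):
    """Integral of k^2 * delta(excess - k^2 / 2) over k from 0 to k_max."""
    h = k_max / steps
    total = 0.0
    for i in range(steps):
        k = (i + 0.5) * h
        total += k * k * delta_gauss(excess - 0.5 * k * k, width)
    return total * h


def k_integral_exact(excess):
    """sqrt(2 * excess) above the threshold, zero below it."""
    return math.sqrt(2.0 * excess) if excess > 0.0 else 0.0


for excess in (-0.5, 0.5, 2.0):
    print(excess, k_integral_numeric(excess), k_integral_exact(excess))
```

Here `excess` plays the role of the photon energy counted from the band gap. Below the gap both columns vanish, and above it the numerical integral follows the square-root law, which is the origin of both the step function and the square root in the transition rate.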
The transition rate in this expression is proportional to the number of elementary
cells in the structure, i.e., to its volume. A more physically relevant quantity would
be the number of transitions per unit time per unit volume, which is

$$\frac{R_{net}}{L^3} = \frac{e^2E_0^2\,\mu^{3/2}}{\sqrt{2}\,\pi^2\hbar^4v^2}\,|\tilde{\mathbf{d}}\cdot\mathbf{w}|^2 \times \left[f_v(\hbar\Omega - E_g) - f_c(\hbar\Omega - E_g)\right]\sqrt{\hbar\Omega - E_g}\;\theta(\hbar\Omega - E_g), \tag{15.110}$$

where I replaced $N_c/L^3 = 1/v$. This result now can be used to find an expression
for the absorption coefficient in the semiconductor, but I leave it for you as an
exercise. Final remark: if you are dealing with unpolarized light, averaging
over polarization might be needed, which as we know will result in the replacement
$|\tilde{\mathbf{d}}\cdot\mathbf{w}|^2 \to |\tilde{\mathbf{d}}|^2/3$.



15.3.4 Selection Rules: Dipole Matrix Elements Made Easier 

Dipole matrix elements $\mathbf{d}_{m_0,m} = \langle\alpha_{m_0}|e\mathbf{r}|\alpha_m\rangle$ play the key role in determining
probabilities of various optical transitions, and their computation constitutes the
largest chunk of what researchers studying emission and absorption spectra of
various quantum systems do. Quite often, knowing the symmetry of the unperturbed
Hamiltonian, you can determine if a matrix element in question vanishes or has
a non-zero value without any actual calculations. Obviously, this knowledge saves
lots of time and effort, and this is why physicists pay a great deal of attention to so-called
selection rules. These are the rules which allow you to determine for which
transitions the matrix element has a finite (non-zero) value and for which it vanishes.
In the former case, the transitions are called dipole-allowed, and in the latter, they
are called dipole-forbidden. Usually, if you go beyond the dipole approximation, the
new perturbation matrix element might have a non-zero, albeit rather small, value,
making the probability of such transitions small but still different from zero.

Derivation of the general selection rules is quite involved and requires knowledge 
of mathematics (theory of groups) which is not expected from the readers of this 
book. However, selection rules for dipole transitions between states represented by 
hydrogen-like wave functions can be derived using more elementary methods. In 
this section I’ll show how this can be done, and in the end, I will fulfill my promise 
and finish the computation started in Eqs. 15.72 and 15.73. 

The selection rules for $\mathbf{d}_{m_0,m}$ are determined by the symmetry of this expression
with respect to inversion and rotations. As I have already explained at least twice
in other parts of the book, in order to utilize the symmetry, you have to consider
how constituent parts of the matrix element, the state vectors and the respective
operator (in this case $\mathbf{r}$), change when a symmetry transformation is performed.
If these changes keep the resulting matrix element unchanged, then there is nothing
you can say about it: it might or might not be equal to zero, but you will have to
actually compute the integrals to find out its actual value. If, however, the changes
of the vectors and the operator are such that the resulting matrix element, which
must remain invariant, appears to change, you would encounter a paradox, which
can only be resolved if the matrix element in question is zero.

The first of the selection rules can be derived by considering the effect of
inversion, the symmetry operation described by the parity operator $\hat{\Pi}$. I have already
analyzed the behavior of matrix elements with respect to inversion in Sect. 7.1.2
and the beginning of Chap. 10, so here I will only remind you of the results. Since
the position operator $\mathbf{r}$ (and the momentum operator $\mathbf{p}$) are odd operators (they
change sign under the action of $\hat{\Pi}$; see Eqs. 5.25 and 5.26), the corresponding matrix
element will vanish unless it is calculated between states with different parity. In
application to the hydrogen-like atoms, this selection rule means that
dipole-allowed transitions are only possible between a state with an odd and a state with an even value of
the orbital quantum number $l$. This follows from the properties of the spherical
harmonics, which are even for even $l$ and odd when $l$ is odd; see Sect. 5.1.4.
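
You can watch the parity rule at work in the simplest case of states with zero magnetic quantum number, for which the angular dependence of the spherical harmonics reduces to a Legendre polynomial of $\cos\theta$. In the minimal Python sketch below (my own illustration, with hypothetical function names), the angular part of the matrix element of $z$ becomes an integral of $P_{l_0}(x)\,x\,P_l(x)$ over $x = \cos\theta$ from $-1$ to $1$, computed with a simple midpoint rule.

```python
def legendre(l, x):
    """Legendre polynomial P_l(x) from the standard three-term recurrence."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def z_angular_integral(l0, l, steps=2000):
    """Midpoint estimate of the integral of P_l0(x) * x * P_l(x) on [-1, 1]."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        x = -1.0 + (i + 0.5) * h
        total += legendre(l0, x) * x * legendre(l, x)
    return total * h


for l0, l in ((0, 0), (0, 1), (1, 1), (0, 2), (1, 2), (0, 3)):
    print(l0, l, round(z_angular_integral(l0, l), 6))
```

Whenever $l_0 + l$ is even, the integrand is odd in $x$ and the integral vanishes, as the parity argument demands. Note that the pair $l_0 = 0$, $l = 3$ also gives zero even though parity allows it: parity is a necessary condition for a non-zero matrix element, not a sufficient one.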

The rest of the selection rules are determined by the rotational symmetry of the 
system, but a direct application of the transformation properties of the eigenvectors 
with respect to rotation is again outside of the mathematics background expected 
from the readers. Therefore, I will address this problem relying directly on the 
properties of the spherical harmonics and operators of the angular momentum. So, 
to make sure that we all are on the same page, let me reiterate: I am interested in the 
properties of the matrix elements of the form 


$$\langle n_0, l_0, m_0|\,\mathbf{r}\,|n, l, m\rangle,$$


where $n$, $l$, $m$ are standard notations for the principal, orbital, and magnetic quantum
numbers, respectively (obviously, the absence of the electric charge in this
expression is inconsequential). I am looking for relationships between orbital and
magnetic quantum numbers of the initial and final states, which would ensure
that the matrix element does not vanish because of the symmetry requirements. I
hope you will not get confused by my use of $m$ to represent two different things:
sometimes it is a generic index enumerating various states of the system, and
sometimes $m$ is a magnetic quantum number defining an eigenvalue of the operator
$L_z$. You should be able to differentiate between the two usages of this symbol
from the context in which it is used.

The required relations between magnetic quantum numbers $m_0$ and $m$ are easiest
to establish. Indeed, these numbers appear in the matrix elements only in the form
$\exp[i(m - m_0)\varphi]$, which comes from the spherical harmonics $Y_l^m(\theta,\varphi)$, multiplied
by the $x$-, $y$-, or $z$-component of the position vector $\mathbf{r}$. Therefore I can write for the
Cartesian components of the dipole matrix element:

$$d_x \propto \int_0^{2\pi}\cos\varphi\; e^{i(m-m_0)\varphi}\,d\varphi \propto \delta_{m-m_0+1,0} + \delta_{m-m_0-1,0},$$

$$d_y \propto \int_0^{2\pi}\sin\varphi\; e^{i(m-m_0)\varphi}\,d\varphi \propto \delta_{m-m_0+1,0} - \delta_{m-m_0-1,0},$$

$$d_z \propto \int_0^{2\pi} e^{i(m-m_0)\varphi}\,d\varphi \propto \delta_{m,m_0},$$

where I omitted the parts of the integrals that do not contain the azimuthal angle $\varphi$
(those bothersome factors with radial functions and associated Legendre functions
and even the $\cos\theta$ or $\sin\theta$ parts of the position vector components) and took into
account that an integral of $\exp(im\varphi)$ over the interval $[0, 2\pi]$ with integer $m$ is
zero unless $m = 0$. These results tell me that the selection rules for different
components of the dipole matrix element are different: for the $x$- and $y$-components,
the matrix elements are not zero only if the magnetic numbers of the initial and final
states differ by one, $|m - m_0| = 1$, while for the $z$-component, it is not zero only if
$m = m_0$. This difference manifests itself experimentally by the different reactions
of the atom to light of different polarizations and propagation directions.
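
The azimuthal integrals behind these rules are simple enough to check by brute force. The following minimal Python sketch (function name and step count are my own hypothetical choices) evaluates them with a midpoint rule for a few pairs of magnetic quantum numbers.

```python
import cmath
import math


def azimuthal_integral(component, m0, m, steps=720):
    """Integral over phi in [0, 2*pi] of w(phi) * exp(i (m - m0) phi),

    with w = cos(phi), sin(phi), or 1 for the x-, y-, or z-component.
    """
    weight = {"x": math.cos, "y": math.sin, "z": lambda phi: 1.0}[component]
    h = 2.0 * math.pi / steps
    total = 0.0 + 0.0j
    for i in range(steps):
        phi = (i + 0.5) * h
        total += weight(phi) * cmath.exp(1j * (m - m0) * phi)
    return total * h


for component in ("x", "y", "z"):
    for dm in (-2, -1, 0, 1, 2):
        value = azimuthal_integral(component, 0, dm)
        print(component, dm, round(abs(value), 6))
```

For the x- and y-components, the only non-vanishing entries are those with the magnetic numbers differing by one, while for the z-component only the entry with equal magnetic numbers survives.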

At this point a thoughtful reader should say: “Wait a second, professor! If the
system is completely isotropic as a hydrogen atom is, then it should make no
difference how we choose all axes of the coordinate system, including its Z-axis.
In this sense, light of any linear polarization can be made polarized along any
axis: we could choose the Z-axis to go along the propagation direction of the wave,
and the X-axis (or Y-axis) along the electric field of the wave. This choice would make
the wave induce transitions between states with magnetic numbers different by one.
If, however, you, on a whim, decide to direct the Z-axis along the electric field, then
the induced transitions would be between the states with the same magnetic number.
And how does all this make any sense? Nature wouldn’t care about the choices
of the coordinate system we make, would it?” This, of course, would be a very fair
question, but I do have a very good answer to it. First, you need to remember that all
states with different magnetic numbers (and the same $n$ and $l$) have the same energy: they are degenerate.
Therefore, the transitions with or without change in $m$ would occur at the same
frequency, and you cannot distinguish between them in the absorption or emission
spectra. Second, the definition of $m$ and of the state with given $m$ is directly attached
to the direction of the polar axis. Changing the direction of the latter, obviously,
would not change the actual quantum state (nature indeed cares very little about
our choices of the coordinate systems), but it will change your description of it: the
same state, which in one system is an eigenvector of the operator $L_z$ characterized
by a certain value of $m$, will now be described as a superposition of eigenvectors
of the operator $L_{\tilde{z}}$, defined with respect to a new polar axis, with different magnetic
numbers.
In order to really observe the difference between transitions with or without
change of magnetic number, you would need to spoil the symmetry enough to lift the
degeneracy of the states, but not so much as to destroy the dependence of the
wave functions upon $\varphi$. This can be achieved, for instance, by placing an atom in a
uniform magnetic field. The resulting Hamiltonian, as you know, still commutes with
$L_z$ if the Z-axis is chosen along the magnetic field, so the eigenvectors of the latter
remain eigenvectors of the Hamiltonian and retain their $\exp(im\varphi)$ behavior. The $m$-selection
rules, which were based completely on this dependence on $\varphi$, remain valid
even in this case, but the direction of the Z-axis is now not arbitrary but set by the
magnetic field. By directing the incident light along the field (its polarization will
obviously be in the X- or Y-directions), you can observe its absorption due to $m \to m \pm 1$
transitions. The transition between the states with the same magnetic numbers can
be observed using light polarized along the magnetic field.

The m-selection rules can also be derived in a more elegant way without reliance 
on a direct computation of integrals in the position representation. It is easy to see 
that operators $L_z$ and $z$ commute ($L_z$ contains only operators $x$, $y$ and correspondingly 
$p_x$ and $p_y$, all of which commute with $z$). So take the identity $L_z z = zL_z$ and multiply it 
from the left by the bra version of the eigenvector of $L_z$, $\langle l_1, m_1|$, and from the right 
by its ket version $|l_2, m_2\rangle$: 

$$\langle l_1, m_1| L_z z |l_2, m_2\rangle = \langle l_1, m_1| z L_z |l_2, m_2\rangle.$$

Using hermiticity of $L_z$, you can apply it to the bra vector on the left-hand side of 
this expression and to the ket vector on its right-hand side: 

$$\hbar m_1 \langle l_1, m_1| z |l_2, m_2\rangle = \hbar m_2 \langle l_1, m_1| z |l_2, m_2\rangle.$$

This expression can only be true if either $m_1 = m_2$ or $\langle l_1, m_1| z |l_2, m_2\rangle = 0$, which 
is exactly our selection rule for the z-component of the dipole matrix element. This 
analysis shows a way to derive the selection rule for two other components of the 
dipole matrix elements, but I will leave it for you as an exercise. 
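
If you would like to see the m-selection rules emerge from plain arithmetic, here is a minimal sketch. It assumes only that the azimuthal dependence of the states is $\exp(im\varphi)$ and that the x-, y-, and z-components of the position vector contribute the factors $\cos\varphi$, $\sin\varphi$, and 1, respectively. The function names `phi_integral` and `allowed` are my own illustrative choices and are not taken from the text.

```python
import cmath
import math


def phi_integral(m1, m2, component, n=360):
    """Approximate the integral over phi in [0, 2*pi) of
    exp(-i*m1*phi) * f(phi) * exp(i*m2*phi),
    where f = 1 for "z", cos(phi) for "x", and sin(phi) for "y".
    A uniform Riemann sum is exact (up to rounding) for these
    low-order trigonometric polynomials."""
    factor = {"z": lambda p: 1.0, "x": math.cos, "y": math.sin}[component]
    step = 2 * math.pi / n
    total = 0j
    for k in range(n):
        phi = k * step
        total += cmath.exp(1j * (m2 - m1) * phi) * factor(phi) * step
    return total


def allowed(m1, m2, component, tol=1e-9):
    """True if the azimuthal integral does not vanish."""
    return abs(phi_integral(m1, m2, component)) > tol


if __name__ == "__main__":
    for m1 in (-1, 0, 1):
        for m2 in (-1, 0, 1):
            flags = {c: allowed(m1, m2, c) for c in "xyz"}
            print(f"m1={m1:+d}, m2={m2:+d}: {flags}")
```

The printed table should show the z-component surviving only for equal magnetic numbers and the x- and y-components surviving only when the magnetic numbers differ by one.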

I will complete this section by deriving the last set of the selection rules 
establishing limitations on the orbital quantum numbers $l$ of the initial and final 
states of the dipole-allowed transitions. This would require analyzing integrals with 
respect to the polar angle $\theta$, which can be written in the following form: 

$$d_{x,y} \propto \int_0^{\pi} P_{l_0}^{m_0}(\cos\theta)\,\sin\theta\,P_{l}^{m_0 \pm 1}(\cos\theta)\,\sin\theta\,d\theta, \tag{15.111}$$

$$d_{z} \propto \int_0^{\pi} P_{l_0}^{m_0}(\cos\theta)\,\cos\theta\,P_{l}^{m_0}(\cos\theta)\,\sin\theta\,d\theta. \tag{15.112}$$
The structure of these expressions is as follows: the associated Legendre functions 
are, of course, the $\theta$-dependent part of the spherical harmonics for the initial and 
final states with m-selection rules incorporated, the $\sin\theta$ or $\cos\theta$ factor between 
them represents the $\theta$-dependent part of the components of the position vector 
expressed in the spherical coordinates, and, finally, the last $\sin\theta$ factor is a part 
of the spherical differential volume element $dV$. A convenient way to compute 
these integrals is to note that trigonometric functions can be expressed in terms 
of the associated Legendre polynomials as $\cos\theta = P_1^0(\cos\theta)$, $\sin\theta = -P_1^1(\cos\theta)$, 
turning integrals in Eqs. 15.111 and 15.112 into integrals of the product of three 
associated Legendre polynomials. These integrals can be evaluated using a widely 
known (among a narrow circle of people) so-called Gaunt’s formula. I am not 
going to reproduce it here because of its rather cumbersome form and easy online 
availability (if you are interested, you can find it on a Wikipedia page for associated 
Legendre polynomials). Moreover, for the examples you will actually be dealing 
with in this book, it is easier to compute these integrals from scratch. However, I 
will use the conditions on numbers $l$ and $m$, specified by this formula, which ensure 
that the respective integral does not vanish. First, it is required that the largest of 
$m$ be equal to the sum of two other magnetic numbers. In our case this condition 
is consistent with m-selection rules, which is easily verifiable by looking at the 
corresponding equations. In Eq. 15.111 the $\sin\theta$ corresponds to $m = 1$, which 
is a perfect match for the m-values of two other Legendre polynomials, while in 
Eq. 15.112 the $\cos\theta$ corresponds to $m = 0$, again in agreement with the already 
established selection rules. Second, it is required that the $l$ numbers of the Legendre 
functions obey the “triangle rule,” in which one of the numbers does not exceed the sum 
of the two others and is not smaller than their difference (like for the sides of a triangle). In our 
case it means 

$$l + 1 \ge l_0 \ge l - 1,$$


leaving me with only three possibilities: $l_0 = l$ or $l_0 = l \pm 1$. The first of them is 
excluded by the parity argument, so that you are left with the last of the selection 
rules: dipole transitions are only allowed between the states with azimuthal quantum 
numbers differing by unity, $|l - l_0| = 1$, which is, of course, consistent with the 
inversion symmetry-based requirement that initial and final states have different 
parities. If you are wondering if this result can be obtained in a different way without 
a referral to an obscure formula too big even to be displayed here, the answer is yes. 
One can derive it using arguments similar to the one I showed for m-selection rules, 
but it involves computing the commutator $[L^2, [L^2, \mathbf{r}]]$ of a rather obscure and 
non-intuitive origin, so the whole approach does not look that elegant anymore. You 
can find it in a popular quantum mechanics textbook by D.J. Griffiths. 10 Another 
way to derive this result is based on the Wigner-Eckart theorem, which deals with 
the generic properties of matrix elements involving eigenvectors of the angular 
momentum, but exposing you to it could be construed as a cruel and unusual 
punishment. 

10 D.J. Griffiths, Introduction to Quantum Mechanics, 2nd edn. (Cambridge University Press, 
Cambridge, 2016). 
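
For readers who prefer to see the $l$-selection rule rather than take it from a formula, here is a small numerical sketch. It is restricted, as a simplifying assumption of mine, to the case $m_0 = 0$ of the z-component integral, where the associated Legendre functions reduce to ordinary Legendre polynomials. The helper names `legendre` and `dz_theta_integral` are hypothetical, and the integration uses a plain Simpson rule.

```python
import math


def legendre(l, x):
    """Legendre polynomial P_l(x) computed with Bonnet's recurrence:
    (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def dz_theta_integral(l0, l, n=2000):
    """Simpson estimate of the integral over theta in [0, pi] of
    P_l0(cos theta) * cos(theta) * P_l(cos theta) * sin(theta),
    i.e. the polar integral for the z-component with m0 = 0."""
    h = math.pi / n  # n must be even for Simpson's rule
    total = 0.0
    for k in range(n + 1):
        t = k * h
        c = math.cos(t)
        f = legendre(l0, c) * c * legendre(l, c) * math.sin(t)
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        total += weight * f
    return total * h / 3


if __name__ == "__main__":
    for l0 in range(4):
        row = [f"{dz_theta_integral(l0, l):+.4f}" for l in range(4)]
        print(f"l0={l0}:", "  ".join(row))
```

In the printed table only the entries with $|l - l_0| = 1$ should differ from zero (up to numerical noise); the diagonal vanishes by parity, and pairs such as $l_0 = 0$, $l = 3$ vanish because they violate the triangle rule.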

Now I can finish computing the dipole matrix elements appearing in Eqs. 15.72 
and 15.73, which are 

$$\boldsymbol{\wp}_1 = \langle 1,0,0|\, e\mathbf{r}\, |2,0,0\rangle,$$

$$\boldsymbol{\wp}_2 = \langle 1,0,0|\, e\mathbf{r}\, |2,1,-1\rangle, \quad \boldsymbol{\wp}_3 = \langle 1,0,0|\, e\mathbf{r}\, |2,1,0\rangle, \quad \boldsymbol{\wp}_4 = \langle 1,0,0|\, e\mathbf{r}\, |2,1,1\rangle.$$

By the parity argument (or $l$-selection rule), you can immediately conclude that 
$\boldsymbol{\wp}_1 = 0$, while all other matrix elements might have nonvanishing values. For $\boldsymbol{\wp}_2$ and 
$\boldsymbol{\wp}_4$, the m-selection rule yields the non-zero values for their x- and y-components, 
while for $\boldsymbol{\wp}_3$ only the z-component does not vanish. Now, let’s go ahead and compute 
them using the known expressions for the hydrogen wave functions from Chap. 8. I 
will need the following (assuming $Z = 1$ and vacuum): 

$$|1,0,0\rangle = \frac{1}{\sqrt{\pi}}\left(\frac{1}{a_B}\right)^{3/2} e^{-r/a_B},$$

$$|2,1,\pm 1\rangle = \frac{1}{8\sqrt{\pi}}\left(\frac{1}{a_B}\right)^{3/2} \frac{r}{a_B}\, e^{-r/2a_B} \sin\theta\, e^{\pm i\varphi},$$

$$|2,1,0\rangle = \frac{1}{4\sqrt{2\pi}}\left(\frac{1}{a_B}\right)^{3/2} \frac{r}{a_B}\, e^{-r/2a_B} \cos\theta,$$

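which gives me 

$$\wp_{2,x} = \frac{e}{8\pi a_B^4} \int_0^{\infty} dr\, r^4 e^{-3r/2a_B} \int_0^{\pi} d\theta\, \sin^3\theta \int_0^{2\pi} d\varphi\, \cos\varphi\, e^{-i\varphi} = \frac{e}{8\pi a_B^4}\, \frac{256 a_B^5}{81}\, \frac{4}{3}\, \pi = \frac{128}{243}\, e a_B,$$

$$\wp_{2,y} = \frac{e}{8\pi a_B^4} \int_0^{\infty} dr\, r^4 e^{-3r/2a_B} \int_0^{\pi} d\theta\, \sin^3\theta \int_0^{2\pi} d\varphi\, \sin\varphi\, e^{-i\varphi} = -i\,\frac{e}{8\pi a_B^4}\, \frac{256 a_B^5}{81}\, \frac{4}{3}\, \pi = -i\,\frac{128}{243}\, e a_B,$$

and 

$$|\boldsymbol{\wp}_2|^2 = |\wp_{2,x}|^2 + |\wp_{2,y}|^2 \approx 0.55\, e^2 a_B^2.$$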
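
If you want to double-check the arithmetic for the x- and y-components, the factorized radial, polar, and azimuthal integrals can be evaluated numerically. The sketch below is only an illustration under my own assumptions: it works in units where $a_B = 1$ and $e = 1$, truncates the radial integral at $r = 60$, and uses a hand-written Simpson rule; the names `simpson`, `p2x`, and `p2y` are hypothetical.

```python
import cmath
import math


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of intervals.
    Works for real- or complex-valued integrands."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


# Units: Bohr radius a_B = 1 and elementary charge e = 1.
radial = simpson(lambda r: r**4 * math.exp(-1.5 * r), 0.0, 60.0)
polar = simpson(lambda t: math.sin(t) ** 3, 0.0, math.pi)
azimuthal_x = simpson(lambda p: math.cos(p) * cmath.exp(-1j * p), 0.0, 2 * math.pi)
azimuthal_y = simpson(lambda p: math.sin(p) * cmath.exp(-1j * p), 0.0, 2 * math.pi)

prefactor = 1.0 / (8.0 * math.pi)  # product of the two normalization constants
p2x = prefactor * radial * polar * azimuthal_x
p2y = prefactor * radial * polar * azimuthal_y
p2_squared = abs(p2x) ** 2 + abs(p2y) ** 2

if __name__ == "__main__":
    print("radial integral :", radial, "(exact 256/81)")
    print("polar integral  :", polar, "(exact 4/3)")
    print("p2x, p2y        :", p2x, p2y)
    print("|p2|^2          :", p2_squared)
```

The numbers should reproduce $128/243 \approx 0.527$ for the x-component, the same magnitude with a factor of $-i$ for the y-component, and about 0.55 for the squared magnitude.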
Now, you can easily notice (using the expressions for the corresponding wave 
functions) that $\boldsymbol{\wp}_4 = \boldsymbol{\wp}_2^{*}$, which means that the x-components of both matrix elements 
are identical, while their y-components are complex conjugates of each other, and 
$|\boldsymbol{\wp}_4|^2 = |\boldsymbol{\wp}_2|^2$. And, finally, the last of the non-zero elements is 

$$\wp_{3,z} = \frac{e}{4\sqrt{2}\,\pi a_B^4} \int_0^{\infty} dr\, r^4 e^{-3r/2a_B} \int_0^{\pi} d\theta\, \sin\theta \cos^2\theta \int_0^{2\pi} d\varphi = \frac{e}{4\sqrt{2}\,\pi a_B^4}\, \frac{256 a_B^5}{81}\, \frac{2}{3}\, 2\pi = \frac{128\sqrt{2}}{243}\, e a_B,$$

$$|\boldsymbol{\wp}_3|^2 \approx 0.55\, e^2 a_B^2,$$

which coincides with $|\boldsymbol{\wp}_2|^2$ and $|\boldsymbol{\wp}_4|^2$, as it must by rotational symmetry, 
and I can finish the computation of the transition rates in this example. For upward 
(absorbing) transitions 

$$R_{E_1,E_2} = \frac{2\pi u(\omega_{E_1,E_2})\, e^2 a_B^2}{3\varepsilon_0 \hbar^2}\,(0.55 \times 2 + 0.55) \approx 3.3\, \frac{\pi u(\omega_{E_1,E_2})\, e^2 a_B^2}{3\varepsilon_0 \hbar^2},$$

while the rate for downward transitions due to stimulated emission is eight times 
smaller due to the difference between degrees of degeneracy of the initial and final 
states for each direction of the transitions. 


15.4 Problems 
For Sect. 15.1 


Problem 175 Consider a one-dimensional harmonic oscillator (mass $m_e$, frequency 
$\omega_0$) acted upon by a uniform force with time dependence of the form 

$$F(t) = F_0 \tau\, \delta(t - t_0).$$

1. Assuming that the oscillator is initially in the ground state, find the probability that 
it will be found in an arbitrary state $|n\rangle$ at time $t > t_0$ using the first and second orders 
of the time-dependent perturbation theory. 

2. Solve Eq. 15.4 exactly (not using the perturbation theory) with the same initial 
condition. 

Problem 176 Consider an electron in a one-dimensional infinite potential well $V(z)$ 
subjected to an additional potential of the form 

$$V(z, t) = \begin{cases} Fz\, t/\tau, & 0 < t < \tau, \\ Fz, & t > \tau. \end{cases}$$

1. Find a probability of transition between a state belonging to the second energy 
level and states belonging to the first and the third energy levels. 

2. Write down an expression for the time-dependent state of the electron in the first 
order of the perturbation theory assuming that initially it is in the ground state, 
and use it to compute expectation values of coordinate and momentum of the 
particle (keep only terms linear in the perturbation). 

Problem 177 Derive an expression for $c_m^{(r)}$, an arbitrary-order approximation for 
the time-dependent expansion coefficients $c_n(t)$. 

Problem 178 Consider an electron in the ground state of a three-dimensional 
symmetrical infinite potential well of width $a$ in all dimensions subjected to a 
perturbation: 

$$V(t) = F_0 (x + y + z)\, \theta(t)\, \theta(\tau - t).$$

Find a probability that after the perturbation is over, you will observe an emission 
from this electron at a frequency $\omega_{21} = (E_2 - E_1)/\hbar$, where $E_2$ and $E_1$ are the 
second and the first lowest energy eigenvalues of the electron. Do not forget that the 
eigenvalue $E_2$ is degenerate. For which $\tau$ will this probability be largest? 

Problem 179 Repeat Problem 178, assuming that the electron is in an infinite 
spherical potential well, while the perturbation has the same form. 

Problem 180 Vectors $|\alpha_m\rangle$ in the derivation of the transition probabilities might 
represent states of many-particle systems as well. Thus, consider a system of two 
non-interacting electrons allowed to move only in the z-direction in a one-dimensional 
potential well of width $a$. They are subjected to a pulse of the uniform electric field: 

$$\mathcal{E} = \mathcal{E}_0 \cos\Omega t\, \theta(t)\, \theta(\tau - t).$$

1. Assuming that the electrons are initially in the ground state, find a probability of 
transition to the lowest in energy excited state. Do not forget that electrons are 
fermions and that they have spin. 

2. How will the answer change if somehow spins of both electrons are always kept 
in the spin-up state? 

Problem 181 Consider a pure spin 1/2 (no orbital degrees of freedom to worry 
about) placed in a uniform constant magnetic field $B$ in the direction of the Z-axis. 
The spin is also subjected to a weak “rotating” magnetic field 
$\mathbf{B}_\perp(t) = b_0 (\mathbf{e}_x \cos\Omega t + \mathbf{e}_y \sin\Omega t)$, where $\mathbf{e}_{x,y}$ are unit vectors in the directions of the X- and 
Y-axes. Considering the “rotating” field as a perturbation, 

1. Find the time-dependent spinor describing the quantum state of this spin in 
the first order of the time-dependent perturbation theory, and compute the 
expectation values of all three spin components. 

2. Assuming that the perturbing field was turned off at time $t = T$ and the z-component 
of the spin is measured, find the probabilities of various outcomes. Repeat the 
same if the x-component is measured. 

Analyze the obtained results as a function of the parameter $\Omega T$. In all cases assume that 
initially the spin was in its ground state. 

Problem 182 Consider a hydrogen atom initially in the ground state subjected to a 
perturbation: 

$$V = \mathcal{E}_0 z\, e^{-|t|/\tau}.$$

Assuming that “initially” corresponds to $t_0 = -\infty$, modify Eq. 15.12 to account for 
this new value of $t_0$ and compute probabilities to find the atom in the ground state at 
$t \to \infty$ in the first order of the perturbation theory. 


For Sect. 15.2.1 

Problem 183 Compare the result of the first-order perturbation theory for the 
transition probability in the case of monochromatic perturbation (Eq. 15.23) with 
the exact solution for the two-level system from Chap. 10. Derive an approximation 
for Eq. 10.32 valid for short time intervals, and compare it with the results of the 
perturbation theory. State the condition to which the duration of the perturbation 
must obey for the probability expression to yield a reliable result. 

Problem 184 Assuming monochromatic perturbation, derive an expression for the 
transition probability in the second order of the perturbation theory. Analyze its 
behavior at $t \to \infty$. 

Problem 185 Re-derive Eq. 15.23 assuming that the perturbation operator has a 
time dependence in the form of the sine instead of cosine function. 

Problem 186 Consider a spin 3/2 particle in a uniform magnetic field in the 
z-direction: $\mathbf{B} = B_0 \mathbf{e}_z$. At time $t = 0$ the particle is subjected to a time-dependent 
magnetic field in the y-direction: $B_1 = b_0 \sin\Omega t$. Assuming that the particle is 
initially in the ground state, find the transition probabilities for all three excited 
states expressed in the delta-functional form of Fermi’s golden rule. 

Problem 187 Consider a one-dimensional harmonic oscillator with time-dependent 
frequency: 

$$\omega = \omega_0 (1 + A \cos\Omega t).$$

1. Present the Hamiltonian of the system in the form $H_0 + V(t)$, and identify the 
form of the time-dependent perturbation Hamiltonian. Rewrite it using lowering 
and raising operators. 

2. Assuming that the oscillator is initially (at $t = 0$) in the $n$th state, determine 
transitions into which states are possible and find the corresponding probabilities 
in the limit of small $t$ and $t \to \infty$. 

3. Write down the time-dependent state of the oscillator, and compute the expectation 
value of the Hamiltonian in it. 

Problem 188 Consider a system with a quasi-monochromatic perturbation operator: 

$$V(t) = v\, e^{-\epsilon t} \cos\Omega t\, \theta(t).$$

1. For the perturbation of this form, derive the expressions for the transition 
probabilities in the first order of the perturbation theory for an arbitrary time 
$t$. 

2. Find the limiting value of these probabilities in the limit $t \to \infty$. 

3. Consider the behavior of these limiting values when $\epsilon \to 0$. Compare the results 
with the delta-functional form of Fermi’s golden rule. 


For Sect. 15.2.3 


Problem 189 Consider a particle in a one-dimensional $\delta$-functional potential well: 

$$V = -a_0 \left[1 + A \cos(\Omega t)\, \theta(t)\right] \delta(x).$$

Find the probability that the perturbation will kick out the particle from its ground 
state. Do not assume that the states in the continuum are just plane waves: find the 
true scattering states in the delta-functional potential. 

Problem 190 Consider a gas consisting of non-interacting potassium atoms with 
concentration $n_{\mathrm{K}} = 10^{12}$ m$^{-3}$. First, find the electron configuration of a potassium 
atom (distribute all electrons among available orbitals) and, assuming that a lone 
s-electron in the last shell behaves as an electron in a quasi-hydrogen atom, determine 
its energy. The gas is being exposed to a uniform time-periodic electric field of 
frequency $\Omega$ which ionizes the atoms by kicking out this electron to the continuous 
segment of the spectrum. Find for which frequency $\Omega$ the number of free electrons 
generated per second per unit volume is largest, and find this number. 
For Sect. 15.3.1 

Problem 191 Sometimes the dipole matrix elements in the expression for the 
transition rate vanish, in which case other weaker interactions between light and 
atom start playing a role. One of them is the magnetic dipole interaction described 
by a perturbation term: 

$$V = \frac{e}{2m_e} B(0, t)\, \mathbf{n}_B \cdot (\mathbf{L} + 2\mathbf{S}),$$

where $\mathbf{n}_B$ and $B(0, t)$ are the direction and magnitude of the magnetic field 
component of the electromagnetic wave, which I described in Eq. 15.56 by its vector 
potential. The magnetic dipole approximation consists in taking the value of the field 
at $\mathbf{r} = 0$. It can be shown that the part of this operator containing the orbital angular 
momentum $\mathbf{L}$ originates from the first correction to the dipole approximation (the term 
linear in $\mathbf{k}\cdot\mathbf{r}$ in the expansion of the exponent $\exp(i\mathbf{k}\cdot\mathbf{r})$), but I will spare you the 
derivation. This expression is similar to the Zeeman term in the atomic Hamiltonian, 
Eq. 14.34, but, unlike the latter, is time-dependent. 

1. Using Eq. 15.56 find the expression for the magnetic field in this wave and relate 
vector $\mathbf{n}_B$ to the polarization vector $\boldsymbol{\nu}$. 

2. Derive a general expression for the perturbation matrix element $v_{m_0,m}$ for the 
magnetic dipole perturbation term (do not specify the nature of the initial and 
final states). Index $m$ here (and in the next problem) is just a generic index 
enumerating states, and not a magnetic quantum number. 

Problem 192 Derive a dipole transition matrix element $v_{m_0,m}$ if the incident 
electromagnetic radiation is described by the vector potential of the form 

$$\mathbf{A}(\mathbf{r}, t) = \frac{\mathcal{E}_0}{\Omega} \left[A_x \mathbf{e}_x \cos(kz - \Omega t) + A_y \mathbf{e}_y \cos(kz - \Omega t + \varphi)\right],$$

where $\mathbf{e}_{x,y}$ are unit vectors and $\sqrt{A_x^2 + A_y^2} = 1$. Find the dependence of the matrix 
elements on the parameters $\varphi$ and $A_{x,y}$. (Hint: Present the trigonometric functions in 
the complex exponential form.) 


For Sect. 15.3.2.1 

Problem 193 Using Eqs. 15.70 and 15.71, demonstrate that Einstein coefficients 
$B_{ab}$ and $B_{ba}$ are related to each other via $g_a B_{ab} = g_b B_{ba}$, where $g_{a,b}$ are degrees of 
degeneracy of the initial and final states $|a\rangle$ and $|b\rangle$. 

Problem 194 Find Einstein’s $B_{ab}$ and $B_{ba}$ coefficients for transitions between the 
second and third excited levels of a particle in a symmetric three-dimensional 
infinite cubic potential well. Find the transition rates, taking into account the 
degeneracy of both the final and the initial states. 

Problem 195 Consider an electromagnetic wave propagating through a gas of 
“pure” spins 1/2 placed in the constant uniform magnetic field $B$ with concentration 
$N$ spins per unit volume. This situation can be realized, for instance, in the case of 
an electron in an atom in a ground state if the frequency of the wave is too low 
to cause any transitions except for those between the spin states. If this is the case, 
you can ignore all quantum numbers of the electron and treat it as a “pure” spin. 
So the magnetic field of the electromagnetic wave induces transitions between two 
energy levels corresponding to eigenvectors of the component of the spin operator 
in the direction of the permanent magnetic field $B$. Assume that the spins are in 
thermal equilibrium, meaning that the ratio of the number of spins in the higher 
energy state $N_b$ to those in the lower energy state $N_a$ is given by the Boltzmann factor 

$$\frac{N_b}{N_a} = \exp\left(-\frac{E_b - E_a}{k_B T}\right).$$

Determine the absorption coefficient in this system. Assuming that the magnitude 
of the permanent magnetic field is 1 T, and the concentration of spins $N = 10^{15}$ per 
unit volume, find the numerical value of the absorption coefficient as a function of 
temperature. How does this absorption behave as the temperature goes up? 

Problem 196 Find the stimulated emission and absorption transition rates between 
the second and third excited states of a three-dimensional isotropic harmonic 
oscillator (mass $\mu$, frequency $\omega$). Do not forget to take into account the degeneracy 
of both states. For numerical estimates, take $\mu$ to be equal to the reduced mass of a 
molecule of CO$_2$, and for frequency $\omega = 7.7 \times 10^{7}$ rad/s. Assume that the molecule 
is exposed to the black-body radiation at temperature $T = 300$ K. 

Problem 197 Find Einstein B coefficients for a magnetic dipole transition (see 
Problem 191) between states of the hydrogen atom $|2, \tfrac{3}{2}, 1\rangle$ and $|2, \tfrac{1}{2}, 1\rangle$ split by the 
spin-orbit interaction (see Sect. 14.1), where the nomenclature for the states is 
$|n, j, l\rangle$, where $n$ is the principal quantum number and $j$ and $l$ are the total and orbital 
angular momentum numbers. 


For Sects. 15.3.2.2-15.3.4 

Problem 198 An electron in an infinite one-dimensional potential well is excited 
to the third energy level and is left alone to return to the ground state by the way 
of spontaneous emission. This can be done via two different routes: $|3\rangle \to |1\rangle$ and 
$|3\rangle \to |2\rangle \to |1\rangle$, where I used notation $|n\rangle$ to designate the state belonging to the 
corresponding energy level. 

1. At which frequencies can the spontaneously emitted light be observed? 

2. What is the spontaneous emission rate for each frequency? 

3. Find the rate at which the number of atoms in the ground state increases. 

Hint: Remember that if a random event is a sequence of two independent events 
occurring consecutively (one after the other), the total probability is the product 
of the probabilities of the individual events. If, however, an event can occur via 
several alternative paths, the total probability is the sum of probabilities of each 
alternative. 

Problem 199 Consider a hydrogen atom excited to $|3,2,-1\rangle$ and $|3,1,-1\rangle$ states, 
where a standard notation for hydrogen stationary states is used (spin-orbit interaction 
is neglected). 

1. Describe all possible channels by which the atom can decay to the ground state, 
and find probabilities for each channel. 

2. Find the lifetime of these states. (Do not forget to consider all possible ways for 
these states to decay, but take into account that once the atom transitioned from 
its initial state, it is no longer in that state, so for the lifetime calculations, only 
the first steps in each chain are relevant.) 

Problem 200 Using Eq. 15.110 and the definition of absorption coefficient, 
Eq. 15.81, derive an expression for the latter for a semiconductor. Assume that 
you are dealing with GaAs (band gap $E_g = 1.43$ eV, effective mass in the conduction 
band $m_c = 0.067 m_e$, effective mass in the valence band $m_v = 0.45 m_e$, where $m_e$ 
is a regular mass of a free electron) at temperature $T = 300$ K. Find the numerical 
value of the absorption coefficient for light at frequency $\hbar\Omega = 1.5 E_g$. 

Problem 201 Derive the magnetic quantum number selection rules for light polarized 
in the x- and y-directions using commutation relations between operators $x$, $y$ 
and $L_z$. 

Problem 202 Determine the selection rules for magnetic dipole transitions 
between hydrogen-like states with its given azimuthal, magnetic, and spin quantum 
numbers. Show that the magnetic dipole transitions between the states with different 
principal numbers are forbidden. (Hint: For this you would have to consider the 
radial part of the matrix element integrals.) 

Problem 203 When a dipole matrix element vanishes because of the dipole 
selection rules and the rate of the corresponding transition goes to zero, it does 
not mean that such transitions could never occur, but one has to go beyond the 
dipole approximation in order to find the mechanism responsible for such so-called 
forbidden transitions. By keeping the term linear in $\mathbf{k}\cdot\mathbf{r}$ in the expansion of the vector 
potential in Eq. 15.56, one would find two new transition mechanisms: the magnetic 
dipole transition (Problems 191 and 202) and a so-called quadrupole transition 
whose contribution to the perturbation potential can be written down as 

$$V = \frac{e\mathcal{E}_0}{2m_e\Omega} \sum_{i,j} Q_{ij}\, \nu_i k_j \left[e^{i\Omega t} + e^{-i\Omega t}\right],$$

where 

$$Q_{ij} = r_i r_j - \frac{1}{3}\delta_{ij} r^2$$

is a so-called quadrupole moment, a $3 \times 3$ matrix with zero trace: $\sum_i Q_{ii} = 0$. 
In the definition of the quadrupole moment, $r_{1,2,3}$ corresponds to the Cartesian 
components of the position vector $x$, $y$, $z$, while $r = \sqrt{x^2 + y^2 + z^2}$ is its magnitude. 
Correspondingly, the matrix element, which determines the fate of all transitions, 
becomes in this case 

$$v_{m_0,m} = \frac{e\mathcal{E}_0}{2m_e\Omega} \sum_{i,j} \langle \alpha_{m_0}| Q_{ij} |\alpha_m\rangle\, \nu_i k_j.$$

Determine the selection rules for the quadrupole-allowed transitions assuming 
hydrogen-like initial and final states. 

Problem 204 Consider a transition between the hydrogen states $|3,2,0\rangle$ and 
$|1,0,0\rangle$, which is dipole-forbidden, and find the quadrupole moment-mediated 
transition rate due to spontaneous emission for this transition. Compare it with a 
dipole-allowed transition between states $|3,1,0\rangle$ and $|1,0,0\rangle$. 



Chapter 16 

Free Electrons in Uniform Magnetic 
Field: Landau Levels and Quantum Hall 
Effect 



The Zeeman effect, which we discussed in some detail in Sect. 14.2, originates 
from the interaction between the magnetic moment of an electron bound to an 
atom and a uniform magnetic field. Experimentally this effect is often observed 
in atomic gases but can also manifest itself with bound electrons in semiconductors 
and dielectrics. What is important is that the quantum states of the electrons in all 
these cases belong to discrete energy eigenvalues. In metals and in the conduction 
band of semiconductors, on the other hand, the energy levels of electrons belong 
to the continuum spectrum, and in some instances, electrons can even be treated as 
free particles. The interaction between such unbound, almost free electrons and the 
uniform magnetic field results in some fascinating effects which had played and are 
still playing an important role in physics. 

Among older phenomena owing their existence to this interaction are the Hall 
effect (the emergence of an electric current in the direction perpendicular to 
the electric field), the de Haas-van Alphen effect (oscillations of magnetization 
of the metal with increasing magnetic field), and the Shubnikov-de Haas effect 
(oscillations of conductivity with the magnetic field). The Hall effect, which is a 
purely classical phenomenon, was discovered by American physicist Edwin Hall in 
1879. It is interesting to note that Hall first worked as a high school principal before 
embarking on Ph.D. studies in physics, which he did at Johns Hopkins University. 
The discovery of the effect, which now bears his name, was part of his doctoral 
work. He became a professor of physics at Harvard University in 1895, but did 
not produce anything even remotely as significant as his student discovery. Both 
de Haas effects owe their existence to the quantum nature of electrons and were 
discovered in 1930 in the laboratory of Dutch physicist Wander Johannes de Haas. 
Pieter M. van Alphen was at that time de Haas’ student, while Lev Shubnikov was 
a Soviet physicist working with de Haas as a visiting scholar. After his return to the 
Soviet Union, Shubnikov was falsely accused of espionage and in 1937 executed 
by the People’s Commissariat for Internal Affairs (NKVD)—Soviet analog of Nazi 
Gestapo (Secret State Police). 
One of the most fascinating and fairly recent magnetic field effects discovered 
in the system of unbound electrons is the quantum Hall effect and the even more 
exotic fractional quantum Hall effect. The former was discovered by German 
physicist K. von Klitzing in 1980, who showed experimentally that the so-called 
Hall conductance (I will explain what it means later, have patience) in a 
two-dimensional electron gas is independent of the geometry and details of the structure 
and is an integer multiple of a universal parameter $e^2/h$ ($e$ is the fundamental 
charge, and $h$ is Planck’s constant without the bar). Plotted as a function of 
the magnetic field, the Hall conductance (or its inverse, called Hall resistance) is 
seen as a series of plateaus separated by finite jumps with a magnitude of $e^2/h$ 
(see Fig. 16.1, where the lower curve gives an example of the Shubnikov-de Haas 
oscillations in a two-dimensional system). 
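
To get a feeling for the size of the universal parameter $e^2/h$, here is a short sketch that evaluates it from the exact SI values of the two constants. The function names `hall_conductance` and `hall_resistance` and the label `nu` for the integer multiple are my own illustrative choices.

```python
PLANCK_H = 6.62607015e-34           # Planck's constant h in J*s (exact in SI)
ELEMENTARY_CHARGE = 1.602176634e-19  # fundamental charge e in C (exact in SI)


def hall_conductance(nu):
    """Hall conductance (in siemens) on the plateau labeled by integer nu."""
    return nu * ELEMENTARY_CHARGE**2 / PLANCK_H


def hall_resistance(nu):
    """Hall resistance (in ohms) on the plateau labeled by integer nu."""
    return 1.0 / hall_conductance(nu)


if __name__ == "__main__":
    for nu in (1, 2, 3, 4):
        print(f"nu = {nu}: R_H = {hall_resistance(nu):.3f} ohm")
```

The first plateau corresponds to a resistance of about 25.8 kΩ, and each following plateau is lower by the corresponding integer factor.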

The significance of this discovery was almost immediately recognized by the 
physics community, and von Klitzing was awarded the 1985 Nobel Prize in Physics. 

In 1982 Horst Stormer, Daniel Tsui, and Arthur Gossard, American physicists 1 
working at that time at Bell Labs, made an even more unexpected discovery: the 
Hall conductivity, which is a rational fractional multiple (as opposed to the integer 
multiple) of $e^2/h$. This phenomenon was called a fractional Hall effect, and it still 
defies full theoretical explanation. The only thing clear from the very beginning was 
that the model of non-interacting electrons, successful in the explanation of “normal 
quantum Hall effect,” is not applicable here. Just a few months after the experimental 
discovery, another American theoretical physicist Robert Laughlin pulled out of thin 
air an exotic many-electron wave function describing a new kind of ground state of 
interacting fermions which can only exist in two-dimensional systems. This ground 
state has a bunch of mysterious properties which can be interpreted by introducing 
particles with fractional charge and fractional (neither boson nor fermion) statistics. 2 
While Laughlin did not really derive his new wave function, but rather guessed it, 
the guess was so successful and generated so many new ideas and concepts that 
Laughlin together with Tsui and Stormer was awarded the 1998 Nobel Prize in 
Physics. 

Fig. 16.1 Experimental curves for the Hall resistance and Ohmic resistivity of a 
heterostructure as a function of the magnetic field at a fixed carrier’s density at 
temperature T = 0.8 mK (from K. von Klitzing, The quantum Hall effect. Rev. 
Mod. Phys. 58(3) (1986), reprinted with permission) 

1 Stormer was born in Germany and got his Ph.D. in France, while Tsui was born in China to a 
family of farmers, did undergraduate work at Lutheran Augustana College in Illinois, and got his 
Ph.D. at the University of Chicago. 

In the foundation of all these phenomena, from the old de Haas-Shubnikov 
oscillations to the newest concepts of the fractional quantum Hall physics, lies 
Landau quantization—the emergence of quantized energy levels in the spectrum of 
an otherwise free particle interacting with a uniform magnetic field. In this chapter 
I will introduce you to the basic physics of Landau levels and will also scratch a 
little bit at the surface of the regular quantum Hall effect. However, even a cursory 
discussion of de Haas effects, especially of the fractional Hall conductivity, lies way 
outside of the scope of this book. 

Quantization of energy eigenvalues of a free electron under an influence of 
a uniform magnetic field was first predicted by Soviet physicist Lev Landau. 
He is probably one of the best-known Soviet physicists, famous for his seminal 
contributions to many areas of physics, and winner of the 1962 Nobel Prize in 
Physics for his theory of superfluidity. It is much less known that he was 
destined to repeat the fate of Lev Shubnikov and was saved only by a direct personal 
intervention of Pyotr Kapitsa—the most powerful Soviet physicist of that time, 
the discoverer of superfluidity, a Nobel Prize winner, and the director of the main 
physics research institute in the Soviet Union. Landau was arrested in 1938 for 
comparing Stalin’s and Hitler’s regimes and spent a year in the infamous NKVD 
Lubyanka prison. It took Kapitsa two personal requests and the threat of resignation 
to secure Landau’s eventual release in 1939. 


2 These are not real “particles” of course and are usually called quasiparticles. They present a 
convenient theoretical model useful for the description of ground state properties of strongly 
interacting electrons. 
16.1 Classical Mechanics of a Charged Particle in Crossed 
Uniform Electric and Magnetic Fields 

To create an appropriate context for the story of the life of a quantum electron in the 
uniform magnetic field, I will begin by reminding you of the properties of classical 
electrons. 


16.1.1 Cyclotron Motion in the Uniform Magnetic Field 

A particle with electric charge $q$ moving in the static magnetic field $\mathbf{B}$ with velocity 
$\mathbf{v}$ experiences the Lorentz force 

$$\mathbf{F} = q\mathbf{v} \times \mathbf{B},$$

which is perpendicular to both the field and the velocity. The resulting motion is a 
combination of a uniform drift with constant speed $v_\parallel = v\cos\theta$ in the direction 
of the field ($\theta$ is the angle between $\mathbf{B}$ and the velocity) and the circular motion 
with constant angular velocity $\Omega_L = v_\perp/R$ in the plane perpendicular to the field, 
where $v_\perp = v\sin\theta$ and $R$ is the radius of the circular trajectory. The part of this 
statement referring to the motion in the direction parallel to $\mathbf{B}$ is rather obvious: 
there is no force component in the $\mathbf{B}$ direction, but to feel at ease with the other part 
describing rotation in the plane perpendicular to $\mathbf{B}$, you would have to dig out of 
your memory the long-forgotten concept of the centripetal acceleration. (I am trying 
to avoid having to formally solve equations of motion here because there is no fun 
in it.) So, centripetal acceleration is always perpendicular to the velocity, changing 
its direction without affecting its magnitude and therefore “forcing” the particle to 
follow a circular path. Preservation of the magnitude of the velocity $v$ can also be 
understood as a consequence of the fact that the Lorentz force, which is perpendicular 
to the velocity, does not do any work, making the kinetic energy and, respectively, 
the magnitude of the velocity conserved quantities. This whole arrangement is shown 
in Fig. 16.2 for your enjoyment. 

The radius of the circle $R$ is determined by Newton’s second law 

$$\frac{m_e v_\perp^2}{R} = q v_\perp B \;\Rightarrow\; R = \frac{m_e v_\perp}{qB} = \frac{m_e v \sin\theta}{qB}.$$

Fig. 16.2 A charged particle in the uniform magnetic field: velocity, trajectory, and the system of coordinates 



It is remarkable that the angular velocity $\Omega_L$ (frequency) of this motion, called the 
Larmor$^3$ frequency, 

$$\Omega_L = \frac{v_\perp}{R} = \frac{qB}{m_e},$$

is independent of the magnitude or direction of the particle’s velocity (this result 
must appear rather strange in the limit $v_\perp \to 0$, when the radius $R$ goes to zero and 
there is no circular motion, but it is what it is). This type of motion is often called 
cyclotron motion because it is exploited in a particle-accelerating device called a 
cyclotron. 

Introducing a coordinate system with the Z-axis along the field, I can describe the 
particle’s motion by projecting its position vector onto the coordinate axes: 

$$x = x_0 + \frac{m_e v \sin\theta}{qB}\cos\Omega_L t \tag{16.1}$$

$$y = y_0 - \frac{m_e v \sin\theta}{qB}\sin\Omega_L t \tag{16.2}$$

$$z = z_0 + v\cos\theta\, t, \tag{16.3}$$


3 Sir Joseph Larmor (1857-1942) was a Northern Irish physicist famous for the discovery of 
Lorentz transformations 2 years before Lorentz and 8 years before Einstein. He also discovered 
the effects of time dilation and length contraction but believed that they are real material changes 
in length rather than pure kinematic effects. He believed in ether and did not believe in relativity, 
both special and general. Still, he held a post of Lucasian Professor of Mathematics at Cambridge 
University, which was established in 1663(!), and was held before him by Newton and after him 
by Dirac. At the time of writing, this post is held by Stephen Hawking. 

where $x_0$ and $y_0$ are the coordinates of the center of the circular trajectory determined 
by initial conditions and $z_0$ is the initial position of the particle on the Z-axis. (Note 
that the time-dependent terms in the equations for the x-coordinate and y-coordinate must have 
opposite signs since if one of the coordinates of the rotating vector increases, the 
other one must decrease.) Substituting $t = 0$ in these equations, you will find that 
$y_0$ is, indeed, equal to the value of $y$ at $t = 0$. At the same time, the relation between 
$x_0$ and $x(0)$ is more complex: 

$$x_0 = x(0) - p_\perp/qB, \tag{16.4}$$

so that the position of the center of the trajectory in the x direction depends on the 
component of the particle’s momentum in the plane perpendicular to the magnetic 
field. 

One important thing you should notice about these expressions is that coordinates 
in the plane normal to the field demonstrate the time dependence typical for a 
harmonic oscillator, while the motion along the field is the one of a free particle. 
You will see that this connection with the harmonic oscillator on the one hand and 
the free particle on the other hand persists even in the quantum description of this 
phenomenon. 
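
If you would like to see Eqs. 16.1-16.3 at work without solving anything by hand, here is a minimal numerical sketch. It evaluates the trajectory on a time grid and checks by finite differences that it obeys Newton’s second law with the Lorentz force. The helper name `cyclotron_trajectory`, the dimensionless units $q = B = m_e = 1$, and the chosen speed and angle are my illustrative assumptions, not anything prescribed by the derivation.

```python
import numpy as np

def cyclotron_trajectory(t, v, theta, q, B, m, center=(0.0, 0.0), z0=0.0):
    """Position from Eqs. 16.1-16.3 for a field B along the Z-axis (hypothetical helper)."""
    omega_L = q * B / m                   # angular frequency of the rotation
    R = m * v * np.sin(theta) / (q * B)   # radius of the circular orbit
    x0, y0 = center                       # center of the circle in the X-Y plane
    x = x0 + R * np.cos(omega_L * t)
    y = y0 - R * np.sin(omega_L * t)
    z = z0 + v * np.cos(theta) * t
    return np.stack([x, y, z], axis=-1)

# Assumed illustrative units: q = B = m = 1, speed v = 1, angle theta = 60 degrees.
q, B, m, v, theta = 1.0, 1.0, 1.0, 1.0, np.pi / 3
t = np.linspace(0.0, 4.0 * np.pi, 4001)   # two full revolutions
r = cyclotron_trajectory(t, v, theta, q, B, m)

dt = t[1] - t[0]
vel = np.gradient(r, dt, axis=0)          # numerical velocity
acc = np.gradient(vel, dt, axis=0)        # numerical acceleration
force = q * np.cross(vel, np.array([0.0, 0.0, B]))   # Lorentz force q v x B

# Skip two points at each end, where one-sided differences are less accurate.
residual = np.abs(m * acc - force)[2:-2].max()
speed = np.linalg.norm(vel[2:-2], axis=1)
print("max |m a - q v x B| :", residual)
print("speed variation     :", speed.max() - speed.min())
```

Both printed numbers should be small compared with 1 (they are limited only by the finite-difference step), which is the numerical counterpart of the statements that the Lorentz force bends the trajectory without changing the speed.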


16.1.2 Classical Motion in Crossed Electric and Magnetic 
Fields 


To make life a bit more interesting, I will now, in addition to the magnetic field $\mathbf{B}$, 
throw in a uniform electric field $\boldsymbol{\mathcal{E}}$ perpendicular to $\mathbf{B}$. Assuming that the electric 
field is in the positive direction of the Y-axis, I can write Newton’s second law 
equations describing the particle’s motion in this case as 

$$m_e \frac{d^2x}{dt^2} = q v_y B, \tag{16.5}$$

$$m_e \frac{d^2y}{dt^2} = -q v_x B + q\mathcal{E}, \tag{16.6}$$

$$m_e \frac{d^2z}{dt^2} = 0. \tag{16.7}$$


It would be natural for you to assume that the electric field, which exerts a force 
and generates an extra acceleration parallel to the Y-axis, would completely destroy 
the picture of the motion described in the previous section. Well, watch my hands. I 
begin by introducing a new coordinate 

$$x' = x - v_E t, \tag{16.8}$$

where the constant parameter $v_E$ is yet to be determined. This change of variable 
does not affect the equation for the x-component of the acceleration at all (after two 
differentiations the extra term with $v_E$ vanishes). However, substituting this relation 
into the equation for the y-component of the acceleration and taking into account that 
$v_x = dx/dt = dx'/dt + v_E$, I get 

$$m_e\frac{d^2y}{dt^2} = -q v'_x B - q v_E B + q\mathcal{E},$$

where $v'_x = dx'/dt$. Now prepare to be amazed: by choosing $v_E = \mathcal{E}/B$, I eliminate 
the electric field from the equations completely. The system of equations, Eqs. 16.5-16.7, 
expressed in terms of coordinates $x'$, $y$, and $z$, looks exactly the same as if the 
electric field were not present at all. So much for the “extra” acceleration in the 
y direction! Consequently, I can use the expressions for the coordinates given in 
Eqs. 16.1-16.3, replacing $x$ with $x'$ and then using Eq. 16.8 to bring back the original 
x-coordinate. As a result, equations for y- and z-components do not change at all, 
while the equation for the x-coordinate becomes 

$$x = x_0 + \frac{\mathcal{E}}{B}t + \frac{m_e v\sin\theta}{qB}\cos\Omega_L t.$$

Ponder this result a bit and enjoy its counterintuitive appeal: an electric field 
in the y direction generates a uniform displacement of charged particles in the 
perpendicular direction! You can understand this result qualitatively by noting that 
the transition to the new coordinate $x'$ is essentially the Galilean transformation to a 
new reference frame moving with speed $v_E$ in the x direction. Since the Lorentz 
force depends on the velocity of the particle, the transition to the new reference frame 
generates an additional contribution to the force, which, given the right choice of 
$v_E$, cancels the electric force. 
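
If the drift in the direction perpendicular to the electric field still feels like a sleight of hand, you can check it by brute force. The sketch below integrates Eqs. 16.5-16.7 with a standard fourth-order Runge-Kutta scheme and measures the average velocity along X over a whole number of cyclotron periods. The function names, the dimensionless units $q = m_e = B = 1$, the field value $\mathcal{E} = 0.2$, and the initial velocity are assumptions made purely for illustration.

```python
import numpy as np

def acceleration(vel, q, m, E_y, B_z):
    """Right-hand sides of Eqs. 16.5-16.7 divided by the mass."""
    vx, vy, _ = vel
    return (q / m) * np.array([vy * B_z, E_y - vx * B_z, 0.0])

def integrate(v0, q, m, E_y, B_z, dt, steps):
    """Fourth-order Runge-Kutta integration starting from the origin."""
    r = np.zeros(3)
    v = np.array(v0, dtype=float)
    for _ in range(steps):
        k1 = acceleration(v, q, m, E_y, B_z)
        k2 = acceleration(v + 0.5 * dt * k1, q, m, E_y, B_z)
        k3 = acceleration(v + 0.5 * dt * k2, q, m, E_y, B_z)
        k4 = acceleration(v + dt * k3, q, m, E_y, B_z)
        # The position is advanced with the velocities of the same RK4 stages.
        r = r + dt * (v + dt * (k1 + k2 + k3) / 6.0)
        v = v + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return r, v

# Assumed illustrative values in units where q = m = B = 1.
q, m, B_z, E_y = 1.0, 1.0, 1.0, 0.2
period = 2.0 * np.pi * m / (q * B_z)      # one cyclotron revolution
steps_per_period, n_periods = 1000, 10
dt = period / steps_per_period

r_final, v_final = integrate((1.0, 0.0, 0.0), q, m, E_y, B_z,
                             dt, steps_per_period * n_periods)
drift_velocity = r_final[0] / (n_periods * period)

print("average velocity along X:", drift_velocity)
print("expected E/B            :", E_y / B_z)
print("net displacement along Y:", r_final[1])
```

After an integer number of revolutions the oscillating terms return to their starting values, so the average velocity along X should agree with $\mathcal{E}/B$ while the net displacement along the field $\boldsymbol{\mathcal{E}}$ itself stays close to zero.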

Now, let me assume that a particle enters the region of electric and magnetic fields 
in the direction perpendicular to B , allowing me not to worry about the particle’s 
displacement in the z direction. Also assume that instead of just one particle, there 
is a whole bunch of them but that I can neglect their mutual repulsion. Imagine now 
that I install a small particle detector of length d in the y direction and width w in 
the z direction (a bar in Fig. 16.3) perpendicular to the X-axis. 


Fig. 16.3 Schematic of 
electron motion in crossed 
electric and magnetic fields. 
Magnetic field (together with 
Z-axis) points out of the page 

















This detector counts the number of particles crossing it during some time $\tau$. 
Apparently this number is equal to the number of particles inside the volume limited 
by the detector in the Y-Z plane and by the distance the particles can travel during 
this time, which can be defined as 

$$\Delta x = x(t+\tau) - x(t) = \frac{\mathcal{E}}{B}\tau + \frac{m_e v\sin\theta}{qB}\left[\cos\Omega_L (t+\tau) - \cos\Omega_L t\right].$$

Now you need to note that as the time $\tau$ grows, the first term in this expression grows 
along with it, while the second term oscillates, remaining limited in value by the 
radius $R$ of the cyclotron orbit. Thus, for times $\tau$ larger than $R/v_E$, you can drop the 
oscillating term and count the number of particles crossing the detector as 

$$N(\tau) = n_e \frac{\mathcal{E}}{B}\,\tau \times d \times w,$$

where $n_e$ is the number of particles per unit volume. Consequently, the total charge 
per unit time per unit area, which not quite accidentally is what people call the 
current density $j$, can be found as 

$$j = \frac{qN}{\tau \times d \times w} = q n_e \frac{\mathcal{E}}{B}. \tag{16.9}$$


Current density $j$ described by Eq. 16.9 is unusual in many respects. First, it 
describes a current flowing perpendicular to the electric field, the phenomenon 
described in the beginning of this chapter as the Hall effect. To emphasize this 
unusual feature, we call the coefficient of proportionality between the current 
density and the electric field in Eq. 16.9 the Hall (as opposed to the regular 
Ohmic) conductivity and use the special two-index notation $\sigma_{xy}$ for it: 

$$\sigma_{xy} = \frac{q n_e}{B}. \tag{16.10}$$

One index in $\sigma_{xy}$ relates to the direction of the field and the other to the direction of 
the current. 

Another important feature of the Hall conductivity is that it has a finite value 
in the absence of any dissipation mechanisms, which are absolutely necessary for 
establishing a regular stationary flow of charges. The dissipation processes, which 
are always present in real materials, of course modify the Hall conductivity as 

$$\sigma_{xy} = \frac{q n_e}{B}\,\frac{1}{1 + \left(\dfrac{1}{\Omega_L \tau_r}\right)^2},$$

where $\tau_r$ is the characteristic time during which a particle loses a significant portion 
of its energy. Equation 16.10 emerges from the last expression in the limit $\tau_r \to \infty$. 
You will have a chance to derive this result as an exercise. 

Finally, you might notice that the Hall conductivity presented by Eq. 16.10 is 
linear in the particle’s charge, while the standard Ohmic conductivity is proportional 
to $q^2$. This is an immensely important difference because particles with charges 
of opposite signs would generate the Hall current flowing in opposite directions. 
By observing the direction of the Hall current, one can find out the charge 
of the current-carrying particles. Thanks to this quirk of the Hall conductivity, 
experimentalists were able to confirm a theoretical prediction that the electric 
current in semiconductors can be carried not only by negatively charged electrons 
but also by positively charged “holes” giving experimental legitimacy to this purely 
theoretical construct. 


16.2 Quantum Theory of Electron’s Motion in a Uniform 
Magnetic Field 

16.2.1 Landau Quantization 


A quantum theory of the electron’s motion in a magnetic field begins as any quantum 
theory does: with a classical Hamiltonian for the same problem. The Hamiltonian that 
has been introduced in Sect. 15.3.1, Eq. 15.53, is of a general enough form so that 
I can use it here with minimal alterations. Since now I am dealing with static fields, 
the vector potential appearing in Eq. 15.53 is time-independent and can, therefore, 
define only the magnetic field. A static electric field has to be described by a 
traditional scalar potential, $\boldsymbol{\mathcal{E}} = -\nabla V$, which can be added to the Hamiltonian as a 
potential energy. I will need to introduce the electric field only later in the section on 
the quantum Hall effect, but what I need right now is the term describing the interaction 
between the magnetic field and the spin of the electron.$^4$ Consequently, I can present 
the Hamiltonian of a free electron in a static magnetic field as 

$$H = \frac{\left[\mathbf{p} + e\mathbf{A}(\mathbf{r})\right]^2}{2m_e} + g\frac{\mu_B}{\hbar}\,\mathbf{B}\cdot\mathbf{S}, \tag{16.11}$$


where I took into account that electrons have the negative charge —e. The last term 
in this expression is the spin Hamiltonian, defined in Chap. 9, Eq. 9.28, with a slight 
modification: I replaced factor 2 in this Hamiltonian with a more generic quantity, 
the so-called spin g-factor. This factor determines the relation between the spin 
operator and the electron’s magnetic moment and is equal to 2 only approximately. 


4 When considering the interaction of an atom with an electromagnetic wave as in Sect. 15.3.1, I did 
not have to worry about the interaction between the magnetic field of the wave and the spin because 
normally such an interaction would be extremely weak. In the case considered in this chapter, the 
magnetic field can be strong enough to make this interaction relevant. 

More accurate relativistic calculations first performed by American physicist 
Julian Schwinger in 1948 gave for this factor the approximate value 2.002319. New 
York City residents might be thrilled to find out that Schwinger, who is considered to 
be one of the most important theoretical physicists of the twentieth century, was born 
in the City and attended Townsend Harris High School located about 50 meters from 
my office in the Science Building at Queens College. He started his undergraduate 
education at City College of New York (Queens College did not yet exist at that 
time) but transferred later to Columbia University. Experimental verification of the 
value of the g-factor, which was found to agree with the theoretical prediction with 
an astonishing accuracy, was one of the triumphs of quantum electrodynamics. 

Now I can explore how quantum effects change the story of the electron’s life 
in the magnetic field. To do that, I need, first of all, to choose a form for the vector 
potential which would reproduce the uniform magnetic field in the direction of the 
Z-axis of my preferred coordinate system. It is important to realize that the choice 
is not unique: there exist infinitely many different vector potentials generating the 
same magnetic field (in science speak it is called gauge invariance). However, not 
all vector potentials are created equal, even if they do reproduce the same field and 
describe the same physical reality. Some are more convenient to work with, others 
are not so much, and different potentials, while providing equivalent descriptions 
of the world, might emphasize its different aspects. Here I will choose the vector 
potential in the form 


$$\mathbf{A} = \mathbf{r}_y\, x B, \tag{16.12}$$

called the Landau gauge, where $\mathbf{r}_y$ is the unit vector in the direction of the Y-axis. This 
expression describes a vector potential with a non-zero y-component (which is its 
sole non-zero component) linearly dependent on the x-coordinate and independent 
of the y-coordinate. Of course, there is no inherent preference of one coordinate in 
the plane perpendicular to the magnetic field over the other, and I could have chosen 
a potential with a sole non-zero component in the x direction, which would depend 
on the y coordinates, but it will not really change the description to any significant 
degree. 

Substituting Eq. 16.12 into the Hamiltonian, I get 

$$H = \frac{\left[\mathbf{p} + e\,\mathbf{r}_y x B\right]^2}{2m_e} + g\frac{\mu_B}{\hbar} B S_z = \frac{p_x^2}{2m_e} + \frac{p_z^2}{2m_e} + \frac{\left[p_y + exB\right]^2}{2m_e} + g\frac{\mu_B}{\hbar} B S_z. \tag{16.13}$$

You can easily check that this Hamiltonian commutes with operators $S_z$, $p_y$, and $p_z$ 
(but not with $p_x$ because of the x-dependent term in the Hamiltonian). This means 
that the eigenvectors of the Hamiltonian, which need to be considered as spinors 
defined in the tensor product of orbital and spin spaces, must have the following 
form: 


$$|\psi\rangle = |p_y\rangle\, |p_z\rangle\, |m_s\rangle\, |\varphi\rangle, \tag{16.14}$$

where $|p_{y,z}\rangle$ are eigenvectors of the corresponding momentum operators, $|m_s\rangle$ is 
an eigenvector of $S_z$ with $m_s$ being the magnetic spin number, and $|\varphi\rangle$ is the 
only unknown vector state, which needs to be determined from the corresponding 
Schrödinger equation. Substituting Eq. 16.14 into 

$$\left[\frac{p_x^2}{2m_e} + \frac{p_z^2}{2m_e} + \frac{\left[p_y + exB\right]^2}{2m_e} + g\frac{\mu_B}{\hbar} B S_z\right]|\psi\rangle = E\,|\psi\rangle,$$

I obtain the equation for the remaining unknown vector $|\varphi\rangle$, 

$$\left[\frac{p_x^2}{2m_e} + \frac{\left[p_y + exB\right]^2}{2m_e}\right]|\varphi\rangle = \left(E - \frac{1}{2} g\mu_B m_s B - \frac{p_z^2}{2m_e}\right)|\varphi\rangle, \tag{16.15}$$

where the eigenvalues $p_y$, $p_z$, and $\hbar m_s/2$ replace the corresponding operators. Factoring 
$eB$ out of the respective term in the Hamiltonian, recognizing that $e^2B^2/m_e$ can be 
rewritten as $m_e\Omega_L^2$, and introducing the notation 

$$x_c = -p_y/eB, \tag{16.16}$$


I can rewrite Eq. 16.15 as 

$$\left[\frac{p_x^2}{2m_e} + \frac{1}{2} m_e\Omega_L^2 (x - x_c)^2\right]|\varphi\rangle = \left(E - \frac{1}{2} g\mu_B m_s B - \frac{p_z^2}{2m_e}\right)|\varphi\rangle. \tag{16.17}$$

Do you remember I promised you that the connection to the harmonic oscillator 
problem found in the classical case will persist in the quantum treatment as well? So, 
here it is, consider the promise fulfilled: Eq. 16.17 describes the harmonic oscillator 
problem with frequency $\Omega_L$ and mass $m_e$, but with the zero point of the potential shifted 
by $x_c$ (see Chap. 7). In the position representation, this shift can be easily corrected 
by introducing the new coordinate $x' = x - x_c$, which obviously does not change 
the eigenvalues and allows me to use the well-known wave functions representing 
the eigenvectors expressed in terms of the new coordinate. Using the results from 
Chap. 7, I can write now that 

$$E_{n,m_s}(p_z) - \frac{1}{2} g\mu_B m_s B - \frac{p_z^2}{2m_e} = \hbar\Omega_L\left(n + \frac{1}{2}\right) \;\Rightarrow$$

$$E_{n,m_s}(p_z) = \hbar\Omega_L\left(n + \frac{1}{2} + \frac{g\mu_B m_e}{2\hbar e}\, m_s\right) + \frac{p_z^2}{2m_e}, \tag{16.18}$$


where I replaced $eB/m_e$ with $\Omega_L$. Energy eigenvalues are now determined by 
one continuous number $p_z$ and two discrete indexes $n$ and $m_s = \pm 1$. They (the 
eigenvalues) can be thought of as a collection of parabolic bands formed by a 
continuously changing $p_z$, each corresponding to a particular value of $n$ and one 
of two values of the spin magnetic number $m_s$ (Fig. 16.4). Due to the dependence of the 
energy on $p_z$, direct manifestations of the Landau quantization in the spectrum of the 
system are washed out. Indeed, for any arbitrarily chosen energy value, you can find 
multiple states belonging to it (see Fig. 16.4) so that the allowed energy values are 
continuously distributed despite Landau quantization. The spectrum of the system 
can become truly discrete only if the electrons are confined to a two-dimensional 
plane (which experimentalists can routinely do these days), so that the z-component 
of the momentum $p_z$ vanishes from Eq. 16.18. The complete discretization of the 
electron’s energies in this case is the main reason why all these exciting quantum 
phenomena that I mentioned earlier occur only in a two-dimensional electron gas. 
Washing out of the discrete structure of the energies does not mean, however, 
that Landau quantization does not have any interesting consequences even in the 
system of regular three-dimensional electrons. One can observe its traces in such 
phenomena as de Haas-van Alphen and Shubnikov-de Haas effects. I will return to 
this issue later, after I have finished with the energy levels $E_{n,m_s}(p_z)$. 

Fig. 16.4 Energy eigenvalues of the electron in a uniform magnetic field. Each parabola represents a 
Landau “band” corresponding to a particular value of $n$. The units for both energy and $p_z$ 
are chosen arbitrarily, so do not pay any attention to the actual numbers on the axes. 
The point of this figure is to show the general qualitative trend in the behavior of the 
Landau bands 

If we are dealing with truly free (not interacting with anything but the magnetic 
field) electrons so that the mass term $m_e$ in the kinetic energy part of the Hamiltonian 
is just the genuine electron mass, then the factor $\hbar e/m_e$ is exactly twice the Bohr 
magneton $\mu_B$, so that Eq. 16.18 simplifies to 

$$E_{n,m_s}(p_z) = \hbar\Omega_L\left[n + \frac{1}{2} + \frac{g m_s}{4}\right] + \frac{p_z^2}{2m_e}. \tag{16.19}$$

If in addition you neglect the deviations of the g-factor from 2, you end up with 
energy levels presented by the even simpler expression 

$$E_{n,m_s}(p_z) = \hbar\Omega_L\left(n + \frac{m_s + 1}{2}\right) + \frac{p_z^2}{2m_e}. \tag{16.20}$$

Since $m_s$ only takes two values, 1 and $-1$, $(m_s + 1)/2$ can only be equal to 0 (if 
$m_s = -1$) or 1 (if $m_s = 1$). The immediate consequence of this is that the energy band 
corresponding to a given $n$ and $m_s = 1$ is exactly the same as the band corresponding 
to $n + 1$ and $m_s = -1$: $E_{n,\uparrow} = E_{n+1,\downarrow}$, making each Landau band doubly degenerate. 
Here I decorated the lower index in the notation for energy with up and down arrows 
instead of numerical values of $m_s$, 1 and $-1$, correspondingly, to make the notation 
visually more striking. The only band that remains non-degenerate is the lowest 
one, corresponding to $n = 0$, $m_s = -1$ and described by the energy $E_{0,\downarrow} = p_z^2/2m_e$. 
This expression looks very much like the energy of a one-dimensional free particle, 
and if you wish to think of this as an exact cancelation of the magnetic forces on the 
electron due to its orbital motion and its spin, you can do that, but do not take this 
analogy too seriously and too far. 

The situation changes, however, if you are dealing with electrons in real metals or 
semiconductors, where they are not quite free as they interact with periodic potential 
of positively charged ions. Most of the current understanding of semiconductor 
physics is based on the idea that the interaction with this potential can be accounted 
for (approximately, of course) by simple replacement of the actual electron’s mass 
$m_e$ by an effective mass $m_{\mathrm{eff}}$ (I have already mentioned that in Sect. 15.3.3 on optical 
transitions in semiconductors). Usually, the effective mass is much smaller than the 
mass of a free electron; for instance, in GaAs, one of the most studied and used 
semiconductors, $m_{\mathrm{eff}} = 0.067 m_e$. This effective mass replaces $m_e$ in the kinetic 
energy term of Eq. 16.11, but it does not affect the expression for the Bohr magneton 
in the spin-related contribution to the Hamiltonian. You might wonder why we are 
treating different parts of the Hamiltonian differently and if it can be justified in any 
way. Actually, there is a good reason for doing this, and while a rigorous explanation 
of this fact would be a bit over your heads, I can offer a hand-waving argument 
in favor of this approach. The electron’s mass in the kinetic energy describes the 
interaction of the electron’s orbital motion, characterized by the de Broglie wavelength, 
with ions. This wavelength typically covers several hundred ions, so that electrons 
feel their potential averaged over large distances, and the effective mass emerges as 
a result of this averaging process. Spin, and the associated magnetic moment of the 
electron, on the other hand, is a local characteristic of the electron not subjected to 
any averaging procedure and remains, therefore, the same as for a free electron. So, 
if I present the effective mass as $m_{\mathrm{eff}} = \kappa m_e$, I can rewrite Eq. 16.18 as 



$$E_{n,m_s}(p_z) = \frac{\hbar e B}{\kappa m_e}\left(n + \frac{1}{2}\right) + \frac{1}{2} g\mu_B m_s B + \frac{p_z^2}{2\kappa m_e}. \tag{16.21}$$

The degeneracy between different Landau bands has all but vanished, and the 
spin-related separation between bands with the same $n$ is equal to 

$$E_{n,\uparrow} - E_{n,\downarrow} = g\mu_B B \approx \frac{\hbar e B}{m_e},$$

where I approximated $g$ with 2. For GaAs this separation is more than ten times 
smaller than the separation between the bands originating from Landau levels with 
adjacent values of $n$, so that Landau bands in this case appear in closely positioned 
pairs corresponding to the bands with the same $n$ but different orientation of the 
spin. 
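
To put some numbers on these statements, here is a short sketch that evaluates the band energies at $p_z = 0$ following the prescription described above: the effective mass $\kappa m_e$ enters the orbital (kinetic) part, while the spin part keeps the free-electron Bohr magneton. The function name `landau_energy`, its parameters, and the field value of 1 T are my assumptions for illustration, and the g-factor is left as a parameter set to 2, as in the approximation used in the text.

```python
import math

# Physical constants in SI units.
E_CHARGE = 1.602176634e-19    # elementary charge, C
HBAR = 1.054571817e-34        # reduced Planck constant, J s
M_E = 9.1093837015e-31        # free electron mass, kg
MU_B = E_CHARGE * HBAR / (2.0 * M_E)   # Bohr magneton, J/T

def landau_energy(n, m_s, B, kappa=1.0, g=2.0, p_z=0.0):
    """Band energy with effective mass kappa * m_e in the kinetic terms only."""
    orbital = HBAR * E_CHARGE * B / (kappa * M_E) * (n + 0.5)
    spin = 0.5 * g * MU_B * m_s * B    # m_s = +1 (spin up) or -1 (spin down)
    kinetic = p_z**2 / (2.0 * kappa * M_E)
    return orbital + spin + kinetic

B = 1.0  # tesla, an arbitrary illustrative value

# Free electron with g = 2: band (n, up) coincides with band (n + 1, down).
free_up = landau_energy(0, +1, B)
free_down_next = landau_energy(1, -1, B)
lowest = landau_energy(0, -1, B)

# GaAs-like effective mass, kappa = 0.067, still with g approximated by 2.
kappa = 0.067
spin_gap = landau_energy(0, +1, B, kappa) - landau_energy(0, -1, B, kappa)
landau_gap = landau_energy(1, -1, B, kappa) - landau_energy(0, -1, B, kappa)
ratio = spin_gap / landau_gap

print("free electron: E(0,up), E(1,down) =", free_up, free_down_next)
print("free electron: E(0,down)          =", lowest)
print("spin gap / Landau gap for kappa   =", ratio)
```

With these assumptions the ratio of the two gaps comes out equal to $\kappa$ itself, which is the quantitative content of the “more than ten times smaller” remark.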

Before continuing, let me take a step back and make a detour into an idealized 
world of genuinely free two-dimensional electrons with g-factor exactly equal to 
2, described by Eq. 16.20 without the term $p_z^2/2m_e$. This model, while not quite 
realistic, is still a useful toy model which allows you a half glimpse into some of 
the coolest ideas of modern physics in a relatively safe and unintimidating setting. 
The main remarkable feature of this model is that its Hamiltonian can be presented 
in the following form: 

$$H = Q^2, \tag{16.22}$$

where 

$$Q = \frac{1}{\sqrt{2m_e}}\left[\sigma_x p_x + \sigma_y (p_y + eBx)\right].$$


Operators $\sigma_{x,y}$ here are Pauli matrices introduced in Eqs. 9.15-9.17. To make it 
easier, I will remind you of those of their properties which I need in order to continue: 
(1) the square of any Pauli matrix is a unity matrix, $\sigma_{x,y,z}^2 = I$, and (2) different Pauli 
matrices anticommute, $\sigma_x\sigma_y + \sigma_y\sigma_x = 0$. To prove Eq. 16.22, you just need to square 
$Q$ and use these properties of the Pauli matrices along the way: 

$$\begin{aligned}
Q^2 &= \frac{1}{2m_e}\left[\sigma_x^2 p_x^2 + \sigma_y^2 (p_y + eBx)^2 + \sigma_x\sigma_y\, p_x (p_y + eBx) + \sigma_y\sigma_x\, (p_y + eBx)\, p_x\right] \\
&= \frac{1}{2m_e}\left[p_x^2 + (p_y + eBx)^2 + eB\left(\sigma_x\sigma_y\, p_x x + \sigma_y\sigma_x\, x p_x\right)\right] \\
&= \frac{1}{2m_e}\left[p_x^2 + (p_y + eBx)^2 - i\hbar eB\,\sigma_x\sigma_y\right].
\end{aligned}$$

Here terms containing commuting operators $p_x p_y$ vanish thanks to the anticommutativity 
of the Pauli matrices, which is also responsible for generating the canonical 
commutator $[x, p_x] = i\hbar$ in the term mixing momentum and coordinate operators. 
Finally, you can verify by direct calculations that $\sigma_x\sigma_y = i\sigma_z$, and replacing $(\hbar/2)\sigma_z$ 
with the spin operator $S_z$, you will complete the proof. 

Operator $Q$ deserves a closer inspection because its eigenvectors are also the 
eigenvectors of the Hamiltonian, while the squares of its eigenvalues are supposed to 
yield the energy levels. Recalling the actual form of the Pauli matrices, I can rewrite 
$Q$ as a two-by-two matrix: 

$$Q = \frac{1}{\sqrt{2m_e}}\left(\begin{bmatrix} 0 & p_x \\ p_x & 0 \end{bmatrix} + \begin{bmatrix} 0 & -i(p_y + eBx) \\ i(p_y + eBx) & 0 \end{bmatrix}\right) = \frac{1}{\sqrt{2m_e}}\begin{bmatrix} 0 & p_x - i(p_y + eBx) \\ p_x + i(p_y + eBx) & 0 \end{bmatrix}.$$

The combination of operators arising in the off-diagonal elements of this matrix 
looks somewhat familiar, so it makes sense to take a look at their commutator: 

$$\begin{aligned}
\left[p_x + i(p_y + eBx),\; p_x - i(p_y + eBx)\right] &= 2i(p_y + eBx)\,p_x - 2i\,p_x (p_y + eBx) \\
&= 2ieB\,(x p_x - p_x x) = -2\hbar eB.
\end{aligned}$$


The important result here is that the commutator is just a regular constant and not 
an operator. Taking advantage of this fact, I can now define new operators 


$$a = \frac{p_x - i(p_y + eBx)}{\sqrt{2\hbar eB}}, \tag{16.23}$$

$$a^\dagger = \frac{p_x + i(p_y + eBx)}{\sqrt{2\hbar eB}}, \tag{16.24}$$

which have exactly the same commutation relation as raising and lowering operators 
in the harmonic oscillator problem: $[a, a^\dagger] = 1$. Isn’t it curious how the ears of 
the harmonic oscillator problem are sticking out everywhere we look? In fact, it is 
getting even curiouser and curiouser if you rewrite operator $Q$ in terms of $a$ and $a^\dagger$: 

$$Q = \sqrt{\hbar\Omega_L}\begin{bmatrix} 0 & a \\ a^\dagger & 0 \end{bmatrix}, \tag{16.25}$$


which yields the Hamiltonian in the form 


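$$\begin{aligned}
H = Q^2 &= \hbar\Omega_L \begin{bmatrix} 0 & a \\ a^\dagger & 0 \end{bmatrix}\begin{bmatrix} 0 & a \\ a^\dagger & 0 \end{bmatrix} = \hbar\Omega_L \begin{bmatrix} a a^\dagger & 0 \\ 0 & a^\dagger a \end{bmatrix} = \hbar\Omega_L \begin{bmatrix} a^\dagger a + 1 & 0 \\ 0 & a^\dagger a \end{bmatrix} \\
&= \hbar\Omega_L\, a^\dagger a \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \hbar\Omega_L \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \\
&= \hbar\Omega_L \left(a^\dagger a + \frac{1}{2}\right)\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \frac{1}{2}\hbar\Omega_L \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}.
\end{aligned} \tag{16.26}$$

The first term in this expression is a spin-independent Hamiltonian of the harmonic 
oscillator, while the second term is our familiar spin Hamiltonian in the magnetic 
field. This result ascertains the role of operators $a$ and $a^\dagger$ as lowering and raising 
operators which generate the excited states of the electron in the magnetic field 
from the ground state in exactly the same manner as they did for a regular harmonic 
oscillator. More specifically, if $|\varphi_n\rangle$ and $|\varphi_{n+1}\rangle$ are normalized eigenvectors 
representing orbital components of states corresponding to the $n$th and $(n+1)$st Landau levels, 
one can write $a^\dagger |\varphi_n\rangle = \sqrt{n+1}\, |\varphi_{n+1}\rangle$ and $a\, |\varphi_{n+1}\rangle = \sqrt{n+1}\, |\varphi_n\rangle$. 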

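To get a bit more insight into the properties of operator $Q$, consider the spinor 

$$\begin{bmatrix} 0 \\ |\varphi_0\rangle \end{bmatrix}$$

describing the zeroth Landau level with the “spin-down” spin state. Applying $Q$ to 
it, you get 

$$Q \begin{bmatrix} 0 \\ |\varphi_0\rangle \end{bmatrix} = \sqrt{\hbar\Omega_L}\begin{bmatrix} 0 & a \\ a^\dagger & 0 \end{bmatrix}\begin{bmatrix} 0 \\ |\varphi_0\rangle \end{bmatrix} = \sqrt{\hbar\Omega_L}\begin{bmatrix} a\,|\varphi_0\rangle \\ 0 \end{bmatrix} = 0$$

(the lowering operator acting on the ground state yields zero), which agrees with the 
previously obtained result that the state of the electron with $n = 0$ and $m_s = -1$ 
in the magnetic field has zero energy. If, however, you consider the action of this 
operator on, let’s say, the “spin-up” state with the same orbital component $|\varphi_0\rangle$, you will get 

$$Q \begin{bmatrix} |\varphi_0\rangle \\ 0 \end{bmatrix} = \sqrt{\hbar\Omega_L}\begin{bmatrix} 0 & a \\ a^\dagger & 0 \end{bmatrix}\begin{bmatrix} |\varphi_0\rangle \\ 0 \end{bmatrix} = \sqrt{\hbar\Omega_L}\begin{bmatrix} 0 \\ a^\dagger |\varphi_0\rangle \end{bmatrix} = \sqrt{\hbar\Omega_L}\begin{bmatrix} 0 \\ |\varphi_1\rangle \end{bmatrix}.$$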


In English it means that operator $Q$ flips the spin of the electron, thereby lowering its 
energy by $\hbar\Omega_L$, while simultaneously lifting it to the next Landau level, increasing its 
energy by the same amount, thereby generating a pair of degenerate states discussed 
in Eq. 16.19. 
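
If you prefer to watch this pairing of states happen rather than take my word for it, the following sketch builds truncated matrix versions of the lowering operator and of $Q$ (in units where $\hbar\Omega_L = 1$) and squares $Q$ numerically. The helper names and the truncation size `N` are my assumptions; the truncation also spoils the very last spin-up entry of the spectrum, so that entry is discarded.

```python
import numpy as np

def lowering_operator(N):
    """Matrix of a in the basis |phi_0>, ..., |phi_{N-1}>: a|phi_n> = sqrt(n)|phi_{n-1}>."""
    return np.diag(np.sqrt(np.arange(1, N)), k=1)

def supercharge(N):
    """Q in units of sqrt(hbar * Omega_L); upper block is spin up, lower block is spin down."""
    a = lowering_operator(N)
    zero = np.zeros((N, N))
    return np.block([[zero, a], [a.T, zero]])

N = 6                       # number of oscillator states kept (assumed truncation)
Q = supercharge(N)
H = Q @ Q                   # Hamiltonian in units of hbar * Omega_L
energies = np.diag(H)       # H is diagonal in this basis
spin_up, spin_down = energies[:N], energies[N:]

print("spin-up levels   (n = 0, 1, ...):", spin_up[:-1])   # last entry is a truncation artifact
print("spin-down levels (n = 0, 1, ...):", spin_down)

# Q acting on the spin-up state with n = 0 and on the spin-down state with n = 0.
up_ground = np.zeros(2 * N)
up_ground[0] = 1.0
down_ground = np.zeros(2 * N)
down_ground[N] = 1.0
flipped = Q @ up_ground         # expected: spin-down state with n = 1
annihilated = Q @ down_ground   # expected: zero vector
```

The spin-up levels come out as $n + 1$ and the spin-down levels as $n$, so every level except the lowest spin-down one appears twice, and `flipped` has its only non-zero component at the position of the spin-down state with $n = 1$.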

Now, if you recall the general discussion of the degenerate eigenvalues, you 
will remember that a degeneracy can often be associated with the set of mutually 
commuting operators. In the case under consideration here, it is the operator Q 
together with the Hamiltonian that forms such a set. An interesting peculiarity of 
this example, distinguishing it from all other examples of commuting operators, 
is that here operator Q mixes spatial-temporal and spinor segments of the total 
tensor product space. By itself it is not big news—you observed this phenomenon 
when we discussed spin-orbital coupling in Sect. 14.1. What is new here is that 
the mixing of orbital and spin degrees of freedom occurs in the absence of any 
spin-orbital interaction. This mixing is a primitive pedestrian version of something 
called supersymmetry: a symmetry transformation connecting particles with half-integer 
spins, the fermions, with integer spin particles, the bosons. In our example, the 
harmonic oscillator represents the bosonic degree of freedom (remember, harmonic 
oscillator lowering-raising operators appear also as creation-annihilation operators 
of photons which are bosons, Sect. 7.3), while the spin degrees of freedom play 
the role of the fermionic degrees of freedom. The eigenvalues and eigenvectors of 
operator Q can shed more light on the connection between orbital and spin states in 
this problem, and I hope you will have fun figuring them out by yourself. 

OK, the detour is over, I hope you enjoyed it, but let’s get back to the 
main business of this chapter. Based on Eq. 16.17, I can write down the position 
representation for the $|\varphi\rangle$ component of the eigenvector of my Hamiltonian, which 
is just a wave function of a harmonic oscillator centered at $x = x_c$ instead of zero: 

$$\varphi_n(x) = \langle x | \varphi_n\rangle = \frac{1}{\sqrt{2^n n! \sqrt{\pi}\,\ell}} \exp\left[-\frac{(x - x_c)^2}{2\ell^2}\right] H_n\!\left(\frac{x - x_c}{\ell}\right), \tag{16.27}$$

where the characteristic length $\ell$ known as magnetic length is 

$$\ell = \sqrt{\frac{\hbar}{eB}}. \tag{16.28}$$


The total wave function $\psi(\mathbf{r})$ includes two free particle-like exponentials 
$\exp\left[i(p_z z + p_y y)/\hbar\right]$ reflecting the free particle-like motion in the z and y directions. The 
corresponding position probability density $|\psi(\mathbf{r})|^2$ looks like a slab with uniform 
probability distribution along the Y and Z axes while exponentially decaying for 
$x > x_c$ and $x < x_c$ (Fig. 16.5). 
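
To get a feeling for the scales involved, here is a small sketch that evaluates the magnetic length and checks numerically that the oscillator wave functions centered at $x_c$ are normalized and mutually orthogonal. It assumes the standard harmonic oscillator form of the wave function with the length scale $\ell = \sqrt{\hbar/eB}$; the function names, the field value of 1 T, and the position of the center $x_c = 3\ell$ are arbitrary illustrative choices.

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermval   # physicists' Hermite polynomials

HBAR = 1.054571817e-34       # reduced Planck constant, J s
E_CHARGE = 1.602176634e-19   # elementary charge, C

def magnetic_length(B):
    """Characteristic length sqrt(hbar / (e B)) in meters for B in tesla."""
    return math.sqrt(HBAR / (E_CHARGE * B))

def phi(n, x, x_c, ell):
    """Oscillator wave function of level n centered at x_c (hypothetical helper)."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0                              # picks out H_n in the Hermite series
    norm = 1.0 / math.sqrt(2.0**n * math.factorial(n) * math.sqrt(math.pi) * ell)
    u = (x - x_c) / ell
    return norm * np.exp(-0.5 * u**2) * hermval(u, coeffs)

ell = magnetic_length(1.0)                       # field of 1 T, assumed for illustration
x_c = 3.0 * ell                                  # arbitrary center of the distribution
x = np.linspace(x_c - 12.0 * ell, x_c + 12.0 * ell, 4001)
dx = x[1] - x[0]

# Overlap matrix <phi_m | phi_n> computed by a simple Riemann sum.
states = np.array([phi(n, x, x_c, ell) for n in range(4)])
overlaps = states @ states.T * dx

print("magnetic length at 1 T, nm:", ell * 1e9)
print("overlap matrix:")
print(np.round(overlaps, 6))
```

The overlap matrix should come out very close to the unit matrix, and the magnetic length at 1 T is of the order of a few tens of nanometers, which is much larger than a typical interatomic distance.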

In addition to the trivial free particle-like dependence of the wave function on 
$p_y$, it also shows another, much less trivial role played by this parameter, which, via 
Eq. 16.16, determines the position of the maximum of the distribution along the X-axis. 
This fact is reflected in Fig. 16.5, where two different plots correspond to two 
different values of $p_y$, generating two different values of $x_c$. 

Fig. 16.5 Probability density $|\psi(\mathbf{r})|^2$ of the electron’s position corresponding to the oscillator 
ground state as a function of two coordinates, x and y. If you can imagine a plot in a four-dimensional 
space, it would show that the probability distribution behaves in the z direction in 
the same way as in the y direction. Two different surfaces in the figure correspond to two different 
values of $p_y$, illustrating its effect on the center of the probability density 

In the classical picture, this maximum would correspond exactly to the center 
of the particle’s trajectory, which you can see by comparing Eqs. 16.4 and 16.16. 
It is important to note at this point that the energy eigenvalues do not depend 
on p y , indicating their significant degree of degeneracy. This degeneracy is what 
distinguishes an electron in the magnetic field from the simple harmonic oscillator 
problem. 


16.2.2 Degeneracy of the Landau Levels and the Density 
of States 

The degeneracy of the energy levels determines the number of single-particle
orbitals available for constructing many-particle states in a many-fermion system
and is expressed in terms of the density of states, a notion which has already been
discussed in the previous chapters in connection with the concept of Fermi energy
(Sect. 11.5) and optical transitions to (or between) states of the continuous spectrum.
Our normal practice for computing the density of states consists in confining
the particles inside a large but finite volume and imposing boundary conditions on
the wave function. However, when dealing with Landau levels, this procedure
acquires several unusual twists. First, while I can and should impose the periodic
boundary conditions on the wave function in the y and z directions, the wave function
vanishes at large distances in the x direction on its own, so no boundary conditions
for the x-coordinate are actually required. As a result, the boundary conditions allow
me to discretize the two continuously changing quantities:

\[ p_z = \frac{2\pi\hbar b_z}{L_z}, \qquad p_y = \frac{2\pi\hbar b_y}{L_y}, \]

where $b_{y,z} = \pm 1, \pm 2, \ldots$, while the third quantum number defining a state of the
electron, integer n from the harmonic oscillator problem, is already discrete. In order
to find how many states with distinct values of $p_y$ correspond to any given
value of n, I need to find the restriction on the number of allowed $p_y$. The periodic
boundary conditions are not too helpful in this regard as they just make $p_y$ discrete
but put no bounds on its values. Also, since the energy does not depend on $p_y$, I
cannot use our regular trick of looking for the number of states corresponding to a
specified energy interval.

To figure all this out, let’s take a look at what is happening to the system as
the absolute value of $p_y$ increases. Equation 16.16 indicates that as $p_y$ increases or
decreases, the center of the probability distribution in the x direction moves further
away from zero toward correspondingly negative or positive coordinates. However,
in a system of finite size, it cannot move too far because at some point you
will be pushing the system outside of the quantization box, which is not allowed.
This restriction sets a limit for the possible value of $x_c$: $x_c < L_x$.$^5$ This is where
the periodic boundary condition discretizing $p_y$ comes in handy, because it also
automatically discretizes parameter $x_c$, which now only takes values

\[ x_c = -\frac{2\pi\hbar b_y}{L_y eB}. \]

The condition $x_c < L_x$ limits the number $N_{n,p_z}$ of the allowed states with given n and
$p_z$ as

\[ \left|\frac{2\pi\hbar b_y}{L_y eB}\right| < L_x \quad\Rightarrow\quad b_y^{\max} = \frac{eBL_xL_y}{2\pi\hbar}, \]

so that the $p_y$-related degeneracy of the Landau levels is

\[ G = \frac{eBL_xL_y}{2\pi\hbar}. \tag{16.29} \]

This expression is sometimes rewritten as

\[ G = \frac{\Phi}{\Phi_0}, \tag{16.30} \]



where $\Phi_0 = h/e$ (note the rare appearance of Planck’s constant without the
bar) is called the fundamental quantum of flux, while $\Phi = BL_xL_y$ is the magnetic flux
through the region of area $A_\perp = L_xL_y$.
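
To get a feel for the size of this degeneracy, here is a minimal Python sketch that evaluates Eq. 16.29 and compares it with the flux ratio of Eq. 16.30. The sample dimensions and the field strength are hypothetical values chosen only for illustration, and the function names are my own.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J*s
H = 2 * math.pi * HBAR      # Planck constant "without the bar", J*s
E_CHARGE = 1.602176634e-19  # elementary charge, C


def flux_quantum():
    """Fundamental quantum of flux Phi_0 = h/e, in webers."""
    return H / E_CHARGE


def landau_degeneracy(b_field, l_x, l_y):
    """Number of p_y states per Landau level, G = e*B*Lx*Ly/(2*pi*hbar)."""
    return E_CHARGE * b_field * l_x * l_y / (2 * math.pi * HBAR)


# Hypothetical sample: 1 mm x 1 mm plate in a 1 T field (illustrative only)
g_count = landau_degeneracy(1.0, 1e-3, 1e-3)
flux_ratio = (1.0 * 1e-3 * 1e-3) / flux_quantum()
print(f"G = {g_count:.3e}, Phi/Phi_0 = {flux_ratio:.3e}")
```

Both numbers come out the same (a few hundred million states per level for this sample), which is just Eq. 16.30 restated numerically.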

These expressions tell you how many states with given n and all possible $p_y$ exist
in the system occupying the surface area $A_\perp$ normal to the magnetic field. If the
particle’s motion is limited to a two-dimensional plane, then this is all you need
to know. In the two-dimensional case, which is of main interest in quantum Hall
effect studies, the momentum component $p_z$ does not exist, the respective wave
function loses the factor $\exp(ip_z z/\hbar)$, and the spectrum of eigenvalues becomes
purely discrete with energies given by the one-dimensional oscillator formula


\[ E_{n,m_s} = \hbar\Omega_L\left(n + \frac{1}{2} + \kappa\frac{g}{4}m_s\right). \tag{16.31} \]

5 This condition is obviously just an approximation because the wave function in the x direction
has a finite size, and requiring that its center remain inside the box does not automatically make
the probability of finding the particle outside of the box vanish. The choice of $x_c$ to define the
number of states is, in this sense, rather arbitrary (you could choose $x_c + \Delta x$, where $\Delta x$
is the uncertainty of the coordinate, or something else for that matter). However, the error which
I make here is of the order of $\Delta x/L_x$ and becomes negligibly small as $L_x$ grows. The count
of the number of states derived by this heuristic method is actually confirmed by a more rigorous
(and immensely more complicated) mathematical approach to the problem.



In the case of discrete levels, the concept of the density of states, which was designed
to count states in the continuous spectrum, is not really needed because we can
simply count all the discrete states belonging to a given degenerate level, and this is
what Eq. 16.30 yields.$^6$ If you wonder where this degeneracy comes from, please
recall that you are dealing with a two-dimensional problem which requires two
quantum numbers to fully characterize a state. This second dimension manifests
itself as a degeneracy of the harmonic oscillator energy levels with respect to the
position of the minimum of the harmonic oscillator potential, so that each Landau
level comprises G states, the same number for all levels.

In the three-dimensional case, which might currently be out of vogue, but is
of interest for more old-fashioned manifestations of Landau quantization, such as
the de Haas-van Alphen and Shubnikov-de Haas effects, there is an extra step I have
to make. Taking into account the quasi-continuous distribution of $p_z$, I am back
to the density-of-states mentality, trying to find the number of states
within an energy interval $\Delta E$ between the values E and $E + \Delta E$. To figure out
the answer, I need to go back to Fig. 16.4 and pay attention to the two horizontal
lines drawn in there. These lines delineate the energy interval $\Delta E$ that appeared in
the question posed above. You can see that several Landau bands characterized by
their corresponding distinct n-numbers and spin indexes contribute their states to
this interval. In order to find a contribution from a single band, consider




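\[ \Delta p_z^{(n,m_s)} = 2\,\frac{dp_z}{dE}\,\Delta E = \frac{\sqrt{2m_e}}{\sqrt{E - \hbar\Omega_L\left(n + \frac{1}{2} + \kappa\frac{g}{4}m_s\right)}}\,\Delta E, \tag{16.32} \]

where the factor of 2 accounts for positive and negative values of the momentum.
The periodic boundary conditions tell me that the number of states $\Delta b_z^{(n,m_s)}$
corresponding to a given interval $\Delta p_z^{(n,m_s)}$ is

\[ \Delta b_z^{(n,m_s)} = \frac{L_z}{2\pi\hbar}\,\Delta p_z^{(n,m_s)}, \]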



6 One can associate a delta-function $\delta(E - E_n)$ with each level and identify the expression
$\sum_n N_n\delta(E - E_n)$ as the total density of states. This identification makes sense if you
integrate it over any finite energy interval to get the total number of states in it.

which together with Eq. 16.32 yields

\[ \Delta b_z^{(n,m_s)} = \frac{L_z\sqrt{2m_e}}{2\pi\hbar\sqrt{E - \hbar\Omega_L\left(n + \frac{1}{2} + \kappa\frac{g}{4}m_s\right)}}\,\Delta E. \tag{16.33} \]

Apparently, this expression makes sense only if $E - \hbar\Omega_L(n + 1/2 + \kappa g m_s/4) > 0$,
which cuts the number of bands that at a given E can donate their states at

\[ n_{\max} = \left[\frac{E}{\hbar\Omega_L} - \frac{1}{2} - \kappa\frac{g}{4}\right], \]

where I took into account that for a given n, the higher lying band corresponds
to $m_s = 1$. The square brackets in this expression indicate that the number inside
needs to be rounded down to the nearest integer. To account for the degeneracy in
the plane normal to the field, the result in Eq. 16.33 must be multiplied by G. Adding
together the contributions from all bands with $n \le n_{\max}$, I can finally get the answer
to my burning question: the total number of states $\Delta N$ with energies between E and
$E + \Delta E$ is


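\[ \Delta N = \frac{e\sqrt{2m_e}B}{4\pi^2\hbar^2}L_xL_yL_z\sum_{n=0}^{n_{\max}}\sum_{m_s}\frac{1}{\sqrt{E - \hbar\Omega_L\left(n + \frac{1}{2} + \kappa\frac{g}{4}m_s\right)}}\,\Delta E. \]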


A more convenient version of this expression that does not require you to keep track
of $n_{\max}$ uses the step function $\Theta[E - \hbar\Omega_L(n + 1/2 + \kappa g m_s/4)]$, which automatically
limits the summation to only those n for which its argument is positive, making the
expression for the density of states easier to analyze:


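\[ g(E) = \frac{\Delta N}{\Delta E} = \frac{e\sqrt{2m_e}B}{4\pi^2\hbar^2}L_xL_yL_z\sum_{n=0}^{\infty}\sum_{m_s}\frac{\Theta\left[E - \hbar\Omega_L\left(n + \frac{1}{2} + \kappa\frac{g}{4}m_s\right)\right]}{\sqrt{E - \hbar\Omega_L\left(n + \frac{1}{2} + \kappa\frac{g}{4}m_s\right)}}. \tag{16.34} \]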


If, as is often the case, the spin-related term in the expression for energy can be 
neglected, the summation over the spin number m s simply yields an extra factor of 
two in the density of states. 
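
The structure of Eq. 16.34 is easy to explore numerically. The sketch below assumes the spin term is neglected (so the spin sum is just a factor of two) and lumps all constants, including that factor, into a single hypothetical `prefactor` argument set to one; energies are measured in the same units as `hbar_omega`. It is meant only to show how the step function switches on one inverse-square-root term per Landau band.

```python
import math


def dos_3d(energy, hbar_omega, prefactor=1.0):
    """Sum of 1/sqrt(E - hbar*Omega_L*(n + 1/2)) over all bands lying below E.

    The while-condition plays the role of the step function in Eq. 16.34:
    a band contributes only if E exceeds its bottom hbar*Omega_L*(n + 1/2).
    """
    total = 0.0
    n = 0
    while energy - hbar_omega * (n + 0.5) > 0:
        total += 1.0 / math.sqrt(energy - hbar_omega * (n + 0.5))
        n += 1
    return prefactor * total


# Below the lowest band there are no states; each new band adds a new term.
for e in (0.4, 1.0, 2.0):
    print(e, dos_3d(e, 1.0))
```

Scanning the energy across a band bottom shows the characteristic inverse-square-root spike of a one-dimensional density of states repeated at every Landau level.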


16.2.3 Fermi Energy of a Gas of Non-interacting Electrons 
in Magnetic Field 

Now I am going to use the results of the previous section to revisit the concept of
the Fermi energy first discussed in Sect. 11.5 for a gas of free electrons. The Fermi
energy is the main characteristic of the many-electron ground state and as such
affects the most practically important properties of electronic systems such as
conductivity, heat capacity, magnetization, etc. It would be interesting, therefore,
to see how the magnetic field and Landau quantization affect it.

I will begin with the two-dimensional case, when all you have to worry about
are different harmonic oscillator states and their degeneracy. I will also neglect the
spin contribution to the energy of Landau levels, in which case the total number of
states at a given Landau level is found by multiplying Eq. 16.29 by two. Then, the
total number of states in Landau levels with $n \le n_{\max}$ for a given magnetic field B is
found as


\[ (n_{\max} + 1)\,\frac{eBL_xL_y}{\pi\hbar} \]


(remember, the first value of n is zero). If the number $N_e$ of electrons in the system
obeys the inequality

\[ n_{\max}\frac{eBL_xL_y}{\pi\hbar} \le N_e < (n_{\max} + 1)\frac{eBL_xL_y}{\pi\hbar}, \]

then the Landau level with $n = n_{\max} - 1$ is the last one completely filled, and the $n_{\max}$-th level is
partially filled, so that the Fermi energy is

\[ E_F = \hbar\Omega_L\left(n_{\max} + \frac{1}{2}\right). \tag{16.35} \]

Introducing the filling fraction $\nu$ (the fraction of the total number of single-electron
orbitals actually used to form a many-electron state) of the $n_{\max}$-th level, I can relate
the total number of electrons in the system to $n_{\max}$ and $\nu$:

\[ N_e = (n_{\max} + \nu)\frac{eBL_xL_y}{\pi\hbar}. \tag{16.36} \]

Now, let’s see what will happen as I gradually increase the magnetic field. Two
things are taking place simultaneously. First, $\Omega_L$ and, therefore, the energy of each
Landau level grow linearly with B, resulting in the initial increase of the Fermi level.
At the same time, the degree of degeneracy of each Landau level grows, and so does its
capacity, so that the number of filled single-electron orbitals at the last partially filled
level decreases until the $n_{\max}$-th level becomes devoid of any electrons. When this
happens, the last filled orbital now corresponds to the $(n_{\max} - 1)$-th level, resulting
in a drop in the Fermi energy by $\hbar\Omega_L$. Further increase in the field again produces
growth of $E_F$ due to the growth of $\Omega_L$, until the filling fraction $\nu$ of the
$(n_{\max} - 1)$-th level drops to zero, resulting in a subsequent drop in the Fermi energy, again by
$\hbar\Omega_L$. This pattern of linear growth interrupted by sudden drops repeats itself until
the magnetic field becomes so large that you would only need states belonging to the
n = 0 Landau level to accommodate all electrons in the system. This happens when
B obeys the inequality


\[ \frac{eBL_xL_y}{\pi\hbar} > N_e \]

or

\[ B > \frac{N_e\pi\hbar}{L_xL_ye}, \]

which, since $\pi\hbar/e = \Phi_0/2$, can also be recast in the form

\[ \Phi > \frac{N_e\Phi_0}{2}. \]


This pattern can be seen in a more quantitative manner by solving Eq. 16.36 for
$n_{\max}$:

\[ n_{\max} = \left[\frac{N_e\pi\hbar}{L_xL_yeB}\right], \]

where $[\cdots]$ again means dropping the fractional part of the number inside the
brackets. Then the Fermi energy can be written down as


\[ E_F = \frac{\hbar eB}{m_e}\left(\left[\frac{N_e\pi\hbar}{L_xL_yeB}\right] + \frac{1}{2}\right). \tag{16.37} \]


These expressions clearly show the two tendencies in the magnetic field dependence 
of the Fermi energy, which are also illustrated in Fig. 16.6. 
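
The sawtooth behavior encoded in Eq. 16.37 can be reproduced with a few lines of Python. In this sketch I introduce, purely for convenience, a reference field $B_0 = N_e\pi\hbar/(L_xL_ye)$ (the field at which the n = 0 level alone can hold all the electrons) and measure the Fermi energy in units of $\hbar eB_0/m_e$; both choices are assumptions of the example, not notation used elsewhere in the chapter.

```python
import math


def fermi_energy_2d(b_ratio):
    """Fermi energy of Eq. 16.37 in units of hbar*e*B_0/m_e.

    b_ratio = B/B_0, so that N_e*pi*hbar/(Lx*Ly*e*B) = 1/b_ratio and
    n_max is the integer part of that number.
    """
    n_max = math.floor(1.0 / b_ratio)
    return b_ratio * (n_max + 0.5)


# Linear growth within a plateau of n_max, then a sudden drop at B = B_0
for b in (0.4, 0.6, 0.99, 1.01, 2.0):
    print(f"B/B_0 = {b:4.2f}  ->  E_F = {fermi_energy_2d(b):.3f}")
```

Plotting `fermi_energy_2d` on a fine grid of `b_ratio` values gives the same qualitative picture as Fig. 16.6: straight segments whose slope decreases each time a Landau level empties.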

Fig. 16.6 Magnetic field dependence of the Fermi energy for a two-dimensional electron gas.
Magnetic field is in the units of $\Phi/\Phi_0$, and the Fermi energy is in the units of $\hbar\Omega_L$

The Fermi energy in the three-dimensional case is obtained by integrating the
density of states over all energies from 0 to $E_F$, which results in the equation

\[ \frac{e\sqrt{2m_e}B}{2\pi^2\hbar^2}L_xL_yL_z\sum_{n=0}^{\infty}\int_0^{E_F}\frac{\Theta\left[E - \hbar\Omega_L\left(n + \frac{1}{2}\right)\right]}{\sqrt{E - \hbar\Omega_L\left(n + \frac{1}{2}\right)}}\,dE = N_e. \]


The integral in this expression is easily computed, yielding

\[ \frac{e\sqrt{2m_e}B}{\pi^2\hbar^2}L_xL_yL_z\sum_{n=0}^{n_{\max}}\sqrt{E_F - \hbar\Omega_L\left(n + \frac{1}{2}\right)} = N_e, \]


where I took into account that the step function in this expression effectively sets
the lower limit of integration for each term in the sum over n to the value $\hbar\Omega_L(n + 1/2)$.
It is also worth noting that after the integration over energy, you essentially
end up with a sum of the momenta $p_z^{(n)} = \sqrt{2m_e[E_F - \hbar\Omega_L(n + 1/2)]}$. This conforms
with the well-known fact that in one-dimensional systems the magnitude of the particle’s
momentum at a given energy E is proportional to the total number of states with energies
less than E. The computation of
the resulting sum over n is not a trivial matter, and I will not attempt it here (I think
I made you suffer quite enough even without it), but I would like to note that the
qualitative pattern of dependence of $E_F$ versus magnetic field is similar: the energy
of the last filled level grows with B, while the filling factor of the respective bands
decreases, until they become empty, and the Fermi level drops to the lower band
corresponding to n − 1. Thus, again one should expect an oscillatory dependence of
the Fermi energy upon the magnetic field.
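
While the sum over n has no simple closed form, the equation for the three-dimensional Fermi energy is easy to solve numerically. The sketch below works with the dimensionless variable $\varepsilon = E_F/\hbar\Omega_L$ and a dimensionless right-hand side `target`, which under the spin-neglected form of the equation would be $N_e\pi^2\hbar^2/(e\sqrt{2m_e}BL_xL_yL_z\sqrt{\hbar\Omega_L})$; this rescaling and the function names are assumptions made for the illustration.

```python
import math


def level_sum(eps):
    """S(eps) = sum of sqrt(eps - n - 1/2) over all n with n + 1/2 < eps."""
    total = 0.0
    n = 0
    while eps - (n + 0.5) > 0:
        total += math.sqrt(eps - (n + 0.5))
        n += 1
    return total


def solve_fermi_level(target, iterations=200):
    """Bisection for eps = E_F/(hbar*Omega_L) such that S(eps) = target > 0."""
    lo, hi = 0.5, 1.5
    while level_sum(hi) < target:   # expand the bracket until it contains the root
        hi *= 2.0
    for _ in range(iterations):     # S(eps) is continuous and monotonically increasing
        mid = 0.5 * (lo + hi)
        if level_sum(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(solve_fermi_level(1.0))   # root of sqrt(eps - 1/2) = 1
```

Since `target` scales as $B^{-3/2}$ at fixed electron density, repeating the solve for a range of fields traces out the oscillatory dependence of the Fermi energy described above.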

I cannot feel completely satisfied unless the total energy of the ground state is
computed, and I am going to do it here for the two-dimensional electrons neglecting
the spin contribution to the energy of the Landau levels. In a system of non-interacting
particles, the total energy is simply equal to the sum of the single-particle
energies of all filled orbitals, so all I need is a bit of bookkeeping. Taking into
account Eq. 16.31 together with Eq. 16.29, I can write for the total energy


\[ E_g^{N_e} = 2G\hbar\Omega_L\left[\sum_{n=0}^{n_{\max}-1}\left(n + \frac{1}{2}\right) + \nu\left(n_{\max} + \frac{1}{2}\right)\right], \]

where the first term in the square brackets counts all the completely filled Landau
levels up to $n_{\max} - 1$ and the second term accounts for the contribution of the
partially filled $n_{\max}$ level with the filling fraction $\nu$ defined by Eq. 16.36. Recalling
the formula for the sum of the arithmetic progression, I have


\[ E_g^{N_e} = 2G\hbar\Omega_L\left[\frac{n_{\max}(n_{\max} - 1)}{2} + \frac{n_{\max}}{2} + \nu\left(n_{\max} + \frac{1}{2}\right)\right] = G\hbar\Omega_L\left[n_{\max}^2 + 2\nu\left(n_{\max} + \frac{1}{2}\right)\right] = G\hbar\Omega_L\left[(n_{\max} + \nu)^2 + \nu - \nu^2\right]. \]

Using Eq. 16.36 to express

\[ n_{\max} + \nu = \frac{\pi\hbar N_e}{eBL_xL_y} = \frac{N_e}{2G}, \tag{16.38} \]

I can rewrite the total ground state energy as

\[ E_g^{N_e} = \frac{\hbar\Omega_LN_e^2}{4G} + \hbar\nu(1 - \nu)G\Omega_L = \frac{\hbar eB}{m_e}\frac{2\pi\hbar N_e^2}{4eBL_xL_y} + \hbar\nu(1 - \nu)\frac{eB}{m_e}\frac{eBL_xL_y}{2\pi\hbar} = \frac{\pi\hbar^2N_e^2}{2m_eL_xL_y} + \nu(1 - \nu)\frac{e^2}{2\pi m_e}B^2L_xL_y. \tag{16.39} \]



The first term in this expression does not depend on the magnetic field and
can be shown to be the ground state energy of the two-dimensional gas of free
electrons (see the problem section in this chapter). If you are confused about the
quadratic dependence of this contribution on $N_e$ (you would rightfully expect a
linear dependence in a system of independent particles), note that the denominator of
the respective expression contains $L_xL_y$, the area of the sample. Thus, the field-independent
part of the total energy is actually proportional to $N_e\sigma_e$, where $\sigma_e$ is the
surface concentration of the electrons, i.e., it is, indeed, linear in $N_e$ as expected and
similar to the case of three-dimensional electrons without the field, Eq. 11.57. The
magnetic field dependence of $E_g^{N_e}$ is again determined by two factors: oscillations of
the filling factor between values 0 and 1, periodic in $1/B$ (check out Eq. 16.38 and
note the $1/B$ factor), and a smooth increase in the energy as $B^2$ (see Fig. 16.7). The
first tendency is a direct consequence of Landau quantization as it reflects changes
in the occupation of Landau levels with increasing B, while the second tendency can
be explained using the classical argument: the energy of a magnet in the magnetic
field is proportional to $\mu B$, where the magnetic dipole moment $\mu$, which is zero in the
absence of B, is itself proportional to the field.
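
If you would like to double-check the bookkeeping behind the total energy, the following Python sketch compares the direct level-by-level sum with the closed form $G\hbar\Omega_L[(n_{\max} + \nu)^2 + \nu - \nu^2]$. The default values `g=1.0` and `hbar_omega=1.0` are arbitrary units chosen for the illustration.

```python
def ground_energy_direct(n_max, nu, g=1.0, hbar_omega=1.0):
    """2*G*hbar*Omega_L*[sum_{n<n_max}(n + 1/2) + nu*(n_max + 1/2)]."""
    filled = sum(n + 0.5 for n in range(n_max))   # completely filled levels
    partial = nu * (n_max + 0.5)                  # partially filled n_max level
    return 2.0 * g * hbar_omega * (filled + partial)


def ground_energy_closed(n_max, nu, g=1.0, hbar_omega=1.0):
    """Closed form G*hbar*Omega_L*[(n_max + nu)^2 + nu - nu^2]."""
    return g * hbar_omega * ((n_max + nu) ** 2 + nu - nu ** 2)


for n_max in range(4):
    for nu in (0.0, 0.25, 0.5):
        print(n_max, nu,
              ground_energy_direct(n_max, nu),
              ground_energy_closed(n_max, nu))
```

The two columns agree for every combination, and the $\nu(1 - \nu)$ piece is what produces the oscillating part of the energy as the field sweeps the filling fraction between 0 and 1.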


Fig. 16.7 Magnetic field dependence of the ground state energy of the two-dimensional electron gas

The oscillations of the ground state energy manifest themselves in oscillations
of immediately observable quantities such as magnetization (magnetic moment per
unit area). Keeping only terms linear in the magnetic field, the magnetization can be
found as


\[ M = -\frac{1}{L_xL_y}\frac{dE_g}{dB} = -\nu(1 - \nu)\frac{e^2}{\pi m_e}B. \tag{16.40} \]

This result means that the two-dimensional gas of free electrons can be characterized
by magnetic permeability $\mu$, defined as a proportionality coefficient between
magnetization and magnetic field

\[ \mu = -\nu(1 - \nu)\frac{e^2}{\pi m_e}, \tag{16.41} \]

which exhibits periodic oscillations as a function of $1/B$. The negative sign of the
permeability indicates that the gas of free electrons is a diamagnetic material, in
which the magnetic field induces magnetization in the direction opposite to that of
the field. The diamagnetic properties of free electrons were also discovered by Lev
Landau in 1930.


16.3 Quantum Hall Effect 

I will complete this chapter with a brief excursion into the huge area of the quantum
Hall effect, which is a quantum version of the motion of a two-dimensional electron in
crossed electric and magnetic fields. From the outset I have to warn you that here I
am just going to scratch the surface of this field, and by no means should you
assume that this section contains anything even remotely resembling a complete
theory of this phenomenon. What I am going to present here is just some very basic,
albeit fundamental, underlying reasons for its existence.

The Hall effect (classical or quantum), as you already know, consists in the
appearance of an electric current in crossed electric and magnetic fields in the
direction perpendicular to both of them. An honest computation of the electric current
cannot be done within the formalism of the pure quantum mechanics considered
in this book. It would require taking into account the temperature-dependent
statistical distribution of the electrons among various single-particle orbitals and
its dependence on the electric and magnetic fields. You would also have to introduce
a variety of dissipative processes, without which a finite conductivity simply cannot
exist (go back to Sect. 16.1.2 and Problem 1 in the exercise section of this chapter),
and dealing with dissipative processes within the quantum formalism is very far
from being a trivial task. All of this would take us into the realm of quantum kinetics,
which is way outside of the scope of this text. Fortunately for us, however, the Hall
conductivity does not need dissipation to acquire a finite value, as you have already seen
in the classical treatment of the Hall effect in Sect. 16.1.2. As a result I can get
away with neglecting the possible dissipative effects, which is, of course, a gross
oversimplification but will give us at least some hints at the underlying physics of
the quantum Hall phenomenon. If in addition I assume that the electrons are in
their many-particle ground state (zero temperature), I can avoid dealing with any
temperature-dependent complications.

In this simplified treatment, the flow of the electrons can be interpreted in terms
of the flow of the probability density. Indeed, consider an electron in a single-particle
orbital described by a wave function $\psi_\sigma(\mathbf{r})$, where $\sigma$ is a composite index
enumerating the orbitals. The measured charge density of these electrons $\rho_\sigma(\mathbf{r})$ is
directly related to the position probability distribution $|\psi_\sigma(\mathbf{r})|^2$: $\rho_\sigma(\mathbf{r}) = e|\psi_\sigma(\mathbf{r})|^2$,
and if there are many electrons occupying multiple states, the total charge density at
a given point is given by $\rho_{tot} = e\sum_\sigma|\psi_\sigma(\mathbf{r})|^2$. Taking into account the continuity
equation for the probability density, Eq. 5.45, I can rewrite it in the form of the
equation for the charge density



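\[ \frac{\partial\rho_{tot}}{\partial t} + e\sum_\sigma\nabla\cdot\mathbf{j}_\sigma = 0, \]

where $\mathbf{j}_\sigma$ is the probability current density for a given orbital. Consequently, the total
electric current density with contributions from all electrons in occupied single-particle
orbitals can be found as

\[ \mathbf{j}_{tot} = e\sum_\sigma\mathbf{j}_\sigma. \tag{16.42} \]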


To lay down some groundwork and as a basis for comparison, I will begin by
evaluating the current density $\mathbf{j}_\sigma$ in the absence of the electric field. It is quite
tempting to just rush ahead and use Eq. 5.42 with the wave function given by
Eq. 16.14 to do the job, and I wouldn’t blame you if you did that, but unfortunately it
would have been wrong. You will realize it right away if you recall that Eq. 5.42 was
derived using the Schrödinger equation with kinetic energy operator $\hat{K} = \hat{\mathbf{p}}^2/2m_e$
and velocity $\hat{\mathbf{v}} = \hat{\mathbf{p}}/m_e$. In the presence of the magnetic field, however, the kinetic
energy is given by


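\[ \hat{K} = \frac{[\hat{\mathbf{p}} + e\mathbf{A}(\mathbf{r})]^2}{2m_e}, \]

and the velocity operator is now

\[ \hat{\mathbf{v}} = \frac{\hat{\mathbf{p}} + e\mathbf{A}(\mathbf{r})}{m_e}, \]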

so I will have to re-derive the formula for the probability current density to reflect
this new reality.

If, on a hunch, you replace the velocity operator in Eq. 5.43 with the new
expression, the probability current density reemerges in the following form:


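\[ \mathbf{j} = \frac{1}{2m_e}\left[\Psi^*(\mathbf{r},t)\left(\hat{\mathbf{p}} + e\mathbf{A}(\mathbf{r})\right)\Psi(\mathbf{r},t) + \Psi(\mathbf{r},t)\left[\left(\hat{\mathbf{p}} + e\mathbf{A}(\mathbf{r})\right)\Psi(\mathbf{r},t)\right]^*\right]. \tag{16.43} \]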


As it turns out, it is a good guess, and this is indeed the correct expression for j. 
However, hunch or no hunch, to know that you have the right result for certain, 
you still have to derive it. This is done by quite literally repeating the same steps 
that resulted in Eq. 5.42, but with a new form of the kinetic energy. Ignoring any 
potential energy, which as you know cancels anyway, I start with 


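\[ i\hbar\frac{\partial\Psi(\mathbf{r},t)}{\partial t} = \frac{[-i\hbar\nabla + e\mathbf{A}(\mathbf{r})]^2}{2m_e}\Psi(\mathbf{r},t) \]

\[ -i\hbar\frac{\partial\Psi^*(\mathbf{r},t)}{\partial t} = \frac{[i\hbar\nabla + e\mathbf{A}(\mathbf{r})]^2}{2m_e}\Psi^*(\mathbf{r},t). \]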

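Expanding the kinetic energy term and ignoring the real-valued $e^2A^2/2m_e$, which will
cancel in the same way as the potential energy, I have (after multiplying the first
equation by $\Psi^*$ and the second one by $\Psi$)

\[ i\hbar\Psi^*\frac{\partial\Psi}{\partial t} = -\frac{\hbar^2}{2m_e}\Psi^*\nabla^2\Psi(\mathbf{r},t) - \frac{ie\hbar}{2m_e}\Psi^*\left[\nabla\cdot(\mathbf{A}\Psi) + \mathbf{A}\cdot\nabla\Psi\right] \]

\[ -i\hbar\Psi\frac{\partial\Psi^*}{\partial t} = -\frac{\hbar^2}{2m_e}\Psi\nabla^2\Psi^*(\mathbf{r},t) + \frac{ie\hbar}{2m_e}\Psi\left[\nabla\cdot(\mathbf{A}\Psi^*) + \mathbf{A}\cdot\nabla\Psi^*\right]. \]
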

You might want to note that unlike the similar calculation in Sect. 15.3.1 I did not
use the fact that for the vector potential in the Coulomb gauge ($\nabla\cdot\mathbf{A} = 0$)
$\nabla\cdot(\mathbf{A}\Psi) = \mathbf{A}\cdot\nabla\Psi$, keeping these two terms as they were. Subtracting the second line from the
first, I get


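\[ \frac{\partial|\Psi(\mathbf{r},t)|^2}{\partial t} = \frac{i\hbar}{2m_e}\left[\Psi^*\nabla^2\Psi - \Psi\nabla^2\Psi^*\right] \]
\[ \qquad - \frac{e}{2m_e}\left[\Psi^*\nabla\cdot(\mathbf{A}\Psi) + \Psi^*\mathbf{A}\cdot\nabla\Psi + \Psi\nabla\cdot(\mathbf{A}\Psi^*) + \Psi\mathbf{A}\cdot\nabla\Psi^*\right]. \]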


The first line here is the same as in Eq. 5.42 and can be written down as

\[ \frac{i\hbar}{2m_e}\nabla\cdot\left(\Psi^*\nabla\Psi - \Psi\nabla\Psi^*\right), \]

and in the second line, as you can easily see by reverse engineering, the result is
equal to

\[ -\frac{e}{2m_e}\nabla\cdot\left(\Psi^*\mathbf{A}\Psi + \Psi\mathbf{A}\Psi^*\right). \]

Combining these two expressions, I again end up with the continuity equation

\[ \frac{\partial|\Psi(\mathbf{r},t)|^2}{\partial t} = -\nabla\cdot\mathbf{j} \]

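with the probability current now defined as

\[ \mathbf{j} = -\frac{i\hbar}{2m_e}\Psi^*\nabla\Psi + \frac{i\hbar}{2m_e}\Psi\nabla\Psi^* + \frac{e}{2m_e}\Psi^*\mathbf{A}\Psi + \frac{e}{2m_e}\Psi\mathbf{A}\Psi^* = \frac{1}{2m_e}\left[\Psi^*\left(-i\hbar\nabla + e\mathbf{A}\right)\Psi + \Psi\left(i\hbar\nabla + e\mathbf{A}\right)\Psi^*\right]. \]

Replacing $-i\hbar\nabla$ with $\hat{\mathbf{p}}$ and $i\hbar\nabla\Psi^*$ with $(\hat{\mathbf{p}}\Psi)^*$, you can immediately see that this
result reproduces Eq. 16.43.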

Now you can take a few minutes to feel proud about your amazing hunch and
your power of intuition, but do not get carried away with it. Besides, the main work
is still ahead of you: the goal was to compute the probability current in the absence
of the electric field and with the magnetic field represented by the vector potential in
the Landau gauge, Eq. 16.12. The expressions for the components of the current density
vector in this case become


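\[ j_x = -\frac{i\hbar}{2m_e}\left[\Psi^*\frac{\partial\Psi}{\partial x} - \Psi\frac{\partial\Psi^*}{\partial x}\right], \tag{16.44} \]

\[ j_y = \frac{1}{2m_e}\left[\Psi^*\left(-i\hbar\frac{\partial\Psi}{\partial y} + eBx\Psi\right) + \Psi\left(i\hbar\frac{\partial\Psi^*}{\partial y} + eBx\Psi^*\right)\right]. \tag{16.45} \]

For the wave function in the form

\[ \Psi(\mathbf{r},t) = \frac{1}{\sqrt{L_y}}e^{-iE_nt/\hbar}e^{ip_yy/\hbar}\varphi_n(x - x_c), \]
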


where $E_n$ is the energy of the corresponding Landau level, while $\varphi_n(x - x_c)$ is the
wave function of the displaced oscillator as defined by Eq. 16.27, with $x_c$ given
by Eq. 16.16, the x-component of the current density obviously vanishes, thanks to the
“real-valuedness” of $\varphi_n(x - x_c)$. As for the y-component, you can easily get

\[ j_y(n, p_y) = \frac{1}{m_eL_y}(p_y + eBx)|\varphi_n(x - x_c)|^2 = \frac{eB}{m_eL_y}(x - x_c)|\varphi_n(x - x_c)|^2, \tag{16.46} \]

where in the last step I used Eq. 16.16 for the center of the wave function. So, in
the absence of the electric field at each value of coordinate x, there exists a non-zero
density of the probability current. This current changes direction,
however, when the coordinate x changes from $x < x_c$ to $x > x_c$. We are interested
in the total current, though, which is normally defined as an integral of a current
density over a surface perpendicular to it. In the two-dimensional case, however,
the surface should be replaced with a one-dimensional line, which in the case under
consideration is just the X-axis. The total electric current, I, is then defined as

\[ I = e\int_{-\infty}^{\infty}j_y\,dx = \frac{e^2B}{m_eL_y}\int_{-\infty}^{\infty}(x - x_c)|\varphi_n(x - x_c)|^2\,dx \]

and is proportional to the expectation value of the displaced coordinate $x - x_c$ in the
$|p_y\rangle|\varphi_n\rangle$ state of the electron. Obviously, this expectation value is equal to zero, so
that the total current in the absence of the electric field vanishes, as expected.

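Now, let’s add the uniform electric field $\boldsymbol{\mathcal{E}} = \mathbf{e}_x\mathcal{E}_0$ into the picture by going back
to Hamiltonian 16.11 with scalar potential $V(x) = -x\mathcal{E}_0$. Since this potential does
not depend on the y-coordinate, the separation of the wave function into the product
$|p_y\rangle|\varphi_n\rangle$ still works with the equation for $\varphi_n(x)$, now taking the form of

\[ -\frac{\hbar^2}{2m_e}\frac{d^2\varphi_n}{dx^2} + \left[\frac{1}{2}m_e\Omega_L^2(x - x_c)^2 + e\mathcal{E}_0x\right]\varphi_n(x) = E_n\varphi_n(x). \]

Obviously, this equation is a simple rehash of Eq. 16.17 with the addition of the
scalar potential term $-eV(x)$ and a slight obvious change of notation. To get a handle
on this equation, let me play with the potential energy term:

\[ \frac{1}{2}m_e\Omega_L^2(x - x_c)^2 + e\mathcal{E}_0x = \frac{1}{2}m_e\Omega_L^2(x - \tilde{x}_c)^2 + \frac{1}{2}m_e\Omega_L^2\left(x_c^2 - \tilde{x}_c^2\right), \]

where I defined

\[ \tilde{x}_c = x_c - \frac{e\mathcal{E}_0}{m_e\Omega_L^2} = x_c - \frac{m_e\mathcal{E}_0}{eB^2}. \tag{16.47} \]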



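Thus, it is still the same old harmonic oscillator problem

\[ -\frac{\hbar^2}{2m_e}\frac{d^2\varphi_n}{dx^2} + \frac{1}{2}m_e\Omega_L^2(x - \tilde{x}_c)^2\varphi_n(x) = \left[E_n - \frac{1}{2}m_e\Omega_L^2\left(x_c^2 - \tilde{x}_c^2\right)\right]\varphi_n(x), \]

where the electric field does two things: (1) it shifts (again) the zero of the quadratic
potential to the new position $\tilde{x}_c$, and (2) it changes the energy of the Landau levels to a
new value defined by

\[ E_n = \frac{1}{2}m_e\Omega_L^2\left(x_c^2 - \tilde{x}_c^2\right) + \hbar\Omega_L\left(n + \frac{1}{2}\right). \]

Using Eqs. 16.16 and 16.47, you can find

\[ \frac{1}{2}m_e\Omega_L^2\left(x_c^2 - \tilde{x}_c^2\right) = \frac{1}{2}\frac{e^2B^2}{m_e}\frac{m_e\mathcal{E}_0}{eB^2}\left(2x_c - \frac{m_e\mathcal{E}_0}{eB^2}\right) = e\mathcal{E}_0x_c - \frac{1}{2}\frac{m_e\mathcal{E}_0^2}{B^2} = -\frac{\mathcal{E}_0p_y}{B} - \frac{1}{2}\frac{m_e\mathcal{E}_0^2}{B^2}. \]

Then the expression for the energy levels becomes

\[ E_n = \hbar\Omega_L\left(n + \frac{1}{2}\right) - \frac{\mathcal{E}_0p_y}{B} - \frac{1}{2}\frac{m_e\mathcal{E}_0^2}{B^2}. \tag{16.48} \]

While the constant $m_e\mathcal{E}_0^2/2B^2$ term in this expression can be eliminated by a
different choice of zero of the electric field potential, the term proportional to $p_y$
is physically significant. It indicates that the electric field lifts the degeneracy of
the Landau levels with respect to $p_y$ and turns them into bands. The dependence on
$p_y$ means that the electrons are now propagating in the y direction with the group
velocity

\[ v_D = \frac{\partial E_n}{\partial p_y} = -\frac{\mathcal{E}_0}{B}, \tag{16.49} \]
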


which coincides with classical velocity v E characterizing the drift of the center of the 
classical cyclotron orbit. Apparently, Eq. 16.49 represents a quantum reincarnation 
of that classical result. It is also interesting that the dependence of the energy upon 
momentum in this case is linear instead of the expected quadratic one. For one, it 
means that the phase velocity of these electrons coincides with their group velocity, 
and the packet formed by the corresponding wave functions would not broaden 
(unlike the case of free electron packets). Essentially, these electrons behave as light 
with the exception that their energy does not go to zero when p y = 0. 
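
The linear dependence of the energy on $p_y$ can be checked numerically without expanding any brackets. The sketch below evaluates the energy directly from the shifted-oscillator form, assuming the sign conventions $x_c = -p_y/(eB)$ for Eq. 16.16 and $\tilde{x}_c = x_c - m_e\mathcal{E}_0/(eB^2)$ for Eq. 16.47, and differentiates it by finite differences. Units with $e = m_e = \hbar = 1$ and the particular field values are arbitrary choices for the illustration.

```python
def landau_energy(p_y, n, e_field, b_field, charge=1.0, mass=1.0, hbar=1.0):
    """E_n = (1/2) m Omega^2 (x_c^2 - x_c_shifted^2) + hbar Omega (n + 1/2)."""
    omega = charge * b_field / mass
    x_c = -p_y / (charge * b_field)                                # assumed form of Eq. 16.16
    x_c_shifted = x_c - mass * e_field / (charge * b_field ** 2)   # assumed form of Eq. 16.47
    shift_term = 0.5 * mass * omega ** 2 * (x_c ** 2 - x_c_shifted ** 2)
    return shift_term + hbar * omega * (n + 0.5)


def group_velocity(p_y, n, e_field, b_field, step=1e-3):
    """Central finite-difference estimate of dE_n/dp_y."""
    upper = landau_energy(p_y + step, n, e_field, b_field)
    lower = landau_energy(p_y - step, n, e_field, b_field)
    return (upper - lower) / (2 * step)


# The result does not depend on p_y or n; its magnitude is E_0/B
print(group_velocity(0.7, 2, e_field=0.3, b_field=2.0))
print(group_velocity(-1.2, 0, e_field=0.3, b_field=2.0))
```

With these conventions both calls return a value of magnitude $\mathcal{E}_0/B = 0.15$ with a negative sign, i.e., the same drift speed for every state in every band.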

This result, while curious in itself, is of little significance for the calculations of
the electric current. Equations 16.44 and 16.45 are not affected by the electric field
(remember, any real-valued potential vanishes from the definition of the probability
current), so the only change I need to incorporate comes from the wave function.
Since the x-dependent part of the wave function is still real-valued, the x-component
of the current still vanishes, while Eq. 16.46 for the y-component now acquires the
form

\[ j_y(n, p_y) = \frac{eB}{m_eL_y}(x - x_c)|\varphi_n(x - \tilde{x}_c)|^2. \]

Now I want you to pay attention: the center of the wave function in the presence of the
electric field is shifted from zero by $\tilde{x}_c$, while the $(x - x_c)$ factor in it still contains
the old shift $x_c$. This simple circumstance makes all the difference, because now the
total current does not vanish:

\[ I_{n,p_y} = \frac{e^2B}{m_eL_y}\int_{-\infty}^{\infty}dx\,(x - x_c)|\varphi_n(x - \tilde{x}_c)|^2 \]
\[ \qquad = \frac{e^2B}{m_eL_y}\int_{-\infty}^{\infty}dx\,(x - \tilde{x}_c)|\varphi_n(x - \tilde{x}_c)|^2 + \frac{e^2B}{m_eL_y}(\tilde{x}_c - x_c)\int_{-\infty}^{\infty}dx\,|\varphi_n(x - \tilde{x}_c)|^2. \]


The first term in the second line of this expression is still zero, of course, but the
integral in the second term is just the normalization integral and is equal to unity.
Thus, I have for the total electric current in the y direction, perpendicular to the
electric and magnetic fields:

\[ I_{n,p_y} = -\frac{e^2B}{m_eL_y}\frac{m_e\mathcal{E}_0}{eB^2} = -\frac{e\mathcal{E}_0}{BL_y}. \]
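
The key step here, that the integral of $(x - x_c)$ weighted with a density centered at a different point $\tilde{x}_c$ equals $\tilde{x}_c - x_c$, is easy to confirm numerically. The sketch below does this for the oscillator ground state, assuming a Gaussian density of width `a` (set to one in arbitrary units) and using a plain trapezoidal rule; the function names and grid parameters are choices made for the illustration.

```python
import math


def phi0_sq(x, center, a=1.0):
    """Normalized ground-state density |phi_0(x - center)|^2 of width a."""
    return math.exp(-((x - center) / a) ** 2) / (a * math.sqrt(math.pi))


def first_moment(x_c, x_c_shifted, a=1.0, half_width=12.0, steps=4000):
    """Trapezoidal estimate of the integral of (x - x_c)|phi_0(x - x_c_shifted)|^2."""
    lo = x_c_shifted - half_width * a
    dx = 2 * half_width * a / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * dx
        weight = 0.5 if i in (0, steps) else 1.0   # end points get half weight
        total += weight * (x - x_c) * phi0_sq(x, x_c_shifted, a)
    return total * dx


print(first_moment(0.3, 0.3))    # no electric field: the current integral vanishes
print(first_moment(0.3, -0.2))   # shifted center: integral equals x_c_shifted - x_c
```

Multiplying the second number by $e^2B/(m_eL_y)$ and substituting the shift from Eq. 16.47 reproduces the single-state current derived above.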

What is important here is that $I_{n,p_y}$ does not depend on either $p_y$ or n, and, therefore,
in order to find the total current as defined in Eq. 16.42, I simply need to multiply
this result by the total number of states occupied by electrons. The contribution from
a single Landau level to the current is

\[ I_n = I_{n,p_y}G = -\frac{e\mathcal{E}_0}{BL_y}\frac{eBL_xL_y}{2\pi\hbar} = -\frac{e^2}{h}\mathcal{E}_0L_x, \]

and now if there are exactly n filled Landau levels, the total current becomes

\[ I_{tot} = -n\frac{e^2}{h}\mathcal{E}_0L_x. \]

h 

$\mathcal{E}_0L_x$ is obviously the potential difference across the sample in the direction of the
field, and the ratio $I_{tot}/\mathcal{E}_0L_x$ is the quantized Hall conductance $\sigma_{xy}$ of the system

\[ \sigma_{xy} = -n\frac{e^2}{h}. \tag{16.50} \]
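
To attach numbers to Eq. 16.50, here is a short Python sketch that evaluates the conductance quantum $e^2/h$ and its inverse using the SI values of the constants. The function name is my own; the minus sign follows the convention of Eq. 16.50.

```python
H_PLANCK = 6.62607015e-34   # Planck constant, J*s
E_CHARGE = 1.602176634e-19  # elementary charge, C


def hall_conductance(n_filled):
    """Quantized Hall conductance sigma_xy = -n e^2/h, in siemens."""
    return -n_filled * E_CHARGE ** 2 / H_PLANCK


for n in (1, 2, 3):
    sigma = hall_conductance(n)
    print(f"n = {n}: sigma_xy = {sigma:.6e} S, |1/sigma_xy| = {abs(1 / sigma):.3f} Ohm")
```

For n = 1 the inverse comes out close to 25.8 kΩ, and each additional filled level changes the conductance by the same universal step.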

You might feel a bit puzzled as to why I did not multiply the degeneracy factor G by
two as I did when deriving the density of states or the ground state energy. The
thing is that even if the spin contribution to the energy of a Landau level is relatively small,
at low temperatures, when the quantum Hall effect is observed, the Landau levels
corresponding to different spin orientations remain well separated from each other,
so that each individual level does not have the spin-related degeneracy. But, when
counting the total number of filled levels, the spin contribution must be counted so
that n in Eq. 16.50 is not the same n as in Eq. 16.31 for Landau levels. (I am sorry
but I have a limited number of symbols to be used in the formulas.) This n counts
all filled Landau levels, including those with different values of the spin number $m_s$.

As the magnetic field shifts the position of the Fermi energy such that instead
of n filled Landau levels, there are only n − 1, the conductivity changes abruptly
by a universal amount $e^2/h$ in complete agreement with the experiment. What this
simple calculation does not explain is why the conductivity does not change when a
Landau level is only partially filled and the number of the occupied states changes
with the magnetic field. In other words, this theory explains the quantum jumps of
the conductivity, but does not explain the plateaus observed in the experiment. But
this is a different story for a different time and a different book. And there are plenty
of those if you are interested.


16.4 Problems 

Problem 205 Consider a charged particle moving in uniform and time-independent
electric and magnetic fields which are perpendicular to each other. The particle is
also subjected to a damping force which can be written down as

\[ \mathbf{F}_d = -\frac{m_e\mathbf{v}}{\tau_r}, \]

where $m_e$ is, as always, the particle’s mass, $\tau_r$ is the relaxation parameter, and $\mathbf{v}$ is
the particle’s velocity.

1. Write down the equations of motion for this particle including the damping force. 

2. Do not attempt to solve these equations in the general case, but see if there is a 
solution with time-independent velocity v (set dv/dt = 0). 

3. Now imagine that you have $N_e$ electrons per unit volume, and using the 
solutions you found, derive the expressions for normal Ohmic conductivity (current in the 
direction of the electric field) and Hall conductivity (current perpendicular to 
both electric and magnetic fields). (Hint: Compute the corresponding current 
densities first.) 
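
If you would like to check your answer to part 2 numerically, the following is a 
rough sketch under assumptions of my own: dimensionless units, the electric field 
along x, the magnetic field along z, a charge q = -1, and a damping force of the form 
-m v / tau, with tau playing the role of the relaxation parameter. All function and 
parameter names are hypothetical. The code integrates the equations of motion with a 
standard fourth-order Runge-Kutta step and lets the velocity settle. 

```python
def acceleration(v, q, m, E, B, tau):
    """In-plane acceleration for E along x, B along z, plus damping -m v / tau."""
    vx, vy = v
    ax = (q / m) * (E + vy * B) - vx / tau   # x-component of q(E + v x B)/m minus damping
    ay = (q / m) * (-vx * B) - vy / tau      # y-component of q(v x B)/m minus damping
    return (ax, ay)


def rk4_step(v, dt, *params):
    """Advance the velocity by one fourth-order Runge-Kutta step of size dt."""
    k1 = acceleration(v, *params)
    k2 = acceleration((v[0] + 0.5 * dt * k1[0], v[1] + 0.5 * dt * k1[1]), *params)
    k3 = acceleration((v[0] + 0.5 * dt * k2[0], v[1] + 0.5 * dt * k2[1]), *params)
    k4 = acceleration((v[0] + dt * k3[0], v[1] + dt * k3[1]), *params)
    return (
        v[0] + dt / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]),
        v[1] + dt / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]),
    )


def relax(q=-1.0, m=1.0, E=1.0, B=2.0, tau=1.0, dt=0.01, t_max=40.0):
    """Start from rest and integrate for many relaxation times."""
    v = (0.0, 0.0)
    for _ in range(int(round(t_max / dt))):
        v = rk4_step(v, dt, q, m, E, B, tau)
    return v


v_steady = relax()
print("velocity after relaxation:", v_steady)
print("residual acceleration:", acceleration(v_steady, -1.0, 1.0, 1.0, 2.0, 1.0))
```

The residual acceleration should be negligibly small, which is the numerical 
counterpart of setting dv/dt = 0. Compare the printed velocity with your own formula 
for the same parameter values. 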

Problem 206 Using Eqs. 16.1 and 16.2, show that the center of the classical 
cyclotron trajectory, $x_0$ and $y_0$, obeys the following relations: 

$$y_0 = y + \frac{1}{eB}\left(p_x + eA_x\right),$$
$$x_0 = x - \frac{1}{eB}\left(p_y + eA_y\right).$$

Problem 207 Assume that instead of the vector potential given in Eq. 16.12, you 
use the vector potential of the form $\mathbf{A} = -\frac{1}{2}\,\mathbf{r}\times\mathbf{B}$: 

1. Show that this vector potential describes the same magnetic field as the one used 
in the text of the chapter. 

2. Write down the Schrödinger equation for the electron in the magnetic field using 
this vector potential, determine the eigenvalues, and find the wave functions 
representing the eigenvectors of the corresponding Hamiltonian. 

3. Derive the expression for the degree of degeneracy of a single Landau level using 
this new vector potential. 

Problem 208 Consider operators 

$$\hat{\Pi}_x = \hat{p}_x + eA_x,$$
$$\hat{\Pi}_y = \hat{p}_y + eA_y,$$

where $A_{x,y}$ are the x- and y-components of an arbitrary vector potential defining a 
magnetic field B. Show that the commutator $[\hat{\Pi}_x, \hat{\Pi}_y]$ is equal to 

$$[\hat{\Pi}_x, \hat{\Pi}_y] = -i\hbar e B_z,$$

where $B_z$ is the z-component of the magnetic field represented by the vector potential 
A. (Hint: Use the position representation for the momentum operator.) 
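
A numerical sanity check is no substitute for the proof, but it can catch a sign 
error. The sketch below is only an illustration under my own assumptions: units with 
$\hbar = e = 1$, an arbitrarily chosen vector potential and test function, and 
central finite differences in place of derivatives. It applies the operators 
$\hat{p}_x + eA_x$ and $\hat{p}_y + eA_y$ to the test function in both orders and 
compares the difference with $-i\hbar e B_z$ times the same function. All names in 
the code are hypothetical. 

```python
import cmath

HBAR = 1.0    # assumption: units with hbar = 1
E_CH = 1.0    # assumption: units with e = 1
STEP = 1e-3   # finite-difference step


def A_x(x, y):
    return -0.5 * y * (1.0 + x)      # arbitrary smooth choice


def A_y(x, y):
    return 0.5 * x + x * y           # arbitrary smooth choice


def B_z(x, y):
    # dA_y/dx - dA_x/dy for the potentials above, computed by hand
    return (0.5 + y) + 0.5 * (1.0 + x)


def psi(x, y):
    # arbitrary smooth complex test function
    return (1.0 + 0.3 * x) * cmath.exp(-(x * x + y * y) / 2.0 + 0.7j * x * y)


def Pi_x(f):
    """Return the function (p_x + e A_x) f, with p_x = -i hbar d/dx."""
    def g(x, y):
        dfdx = (f(x + STEP, y) - f(x - STEP, y)) / (2.0 * STEP)
        return -1j * HBAR * dfdx + E_CH * A_x(x, y) * f(x, y)
    return g


def Pi_y(f):
    """Return the function (p_y + e A_y) f, with p_y = -i hbar d/dy."""
    def g(x, y):
        dfdy = (f(x, y + STEP) - f(x, y - STEP)) / (2.0 * STEP)
        return -1j * HBAR * dfdy + E_CH * A_y(x, y) * f(x, y)
    return g


def commutator_on_psi(x, y):
    """[Pi_x, Pi_y] applied to psi, evaluated at the point (x, y)."""
    return Pi_x(Pi_y(psi))(x, y) - Pi_y(Pi_x(psi))(x, y)


def expected(x, y):
    """The claimed result, -i hbar e B_z psi, at the same point."""
    return -1j * HBAR * E_CH * B_z(x, y) * psi(x, y)


for point in [(0.3, -0.2), (1.0, 0.5)]:
    print(point, abs(commutator_on_psi(*point) - expected(*point)))
```

The printed differences should be small, limited only by the accuracy of the finite 
differences. 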

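Problem 209 

1. Using operators $\hat{\Pi}_{x,y}$ introduced in the previous problem, construct the following 
operators: 

$$\hat{a} = \frac{1}{\sqrt{2\hbar eB}}\left(\hat{\Pi}_y + i\hat{\Pi}_x\right), \qquad \hat{a}^\dagger = \frac{1}{\sqrt{2\hbar eB}}\left(\hat{\Pi}_y - i\hat{\Pi}_x\right),$$

and find their commutator $[\hat{a}, \hat{a}^\dagger]$. 

2. Assuming that $A_z = 0$, present the Hamiltonian given by Eq. 16.11 in terms of 
operators $\hat{a}$ and $\hat{a}^\dagger$. Comment on the result. 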

Problem 210 Find the total ground state energy of a two-dimensional gas of free 
electrons (introduce periodic boundary conditions and find the density of states; 
using the Pauli principle, find the Fermi energy and the total energy of the 
many-electron ground state). 

Problem 211 Generalize Eqs. 16.39, 16.36, and 16.39 taking into account the 
interaction of the spin of the electrons with the magnetic field. (Hint: The degeneracy 
with respect to $p_y$ does not change in the presence of the spin, but the expression for 
G in Eq. 16.29 must now be divided by two since spin variables are now counted 
explicitly. While the energy levels are now labeled by two numbers, you can still 
organize them in ascending order and assign a single number to each energy.) 

Problem 212 Find the eigenvalues and eigenvectors of operator Q defined in 
Eq. 16.25. (Hint: Consider a superposition of a spinor corresponding to Landau 
level n - 1 and spin up with a spinor corresponding to Landau level n and 
spin down. Both these states correspond to the same eigenvalue of the Hamiltonian 
given by Eq. 16.11.) 

Problem 213 Find an alternative expression for the potential describing the 
uniform electric field in the quantum Hall problem which would eliminate the term 

$$\frac{1}{2} m_e \frac{\mathcal{E}_0^2}{B^2}$$

in Eq. 16.48 for the Landau energy levels in the presence of the electric field. 